Abstract
Objective
To compare two semiquantitative scoring systems for assessing the prevalence and severity of morphologic cartilage lesions, meniscal damage and bone marrow lesions from MRIs of knees with OA.
Methods
From participants in the Osteoarthritis Initiative, a sample of 115 knees with radiographic OA at high risk of cartilage loss was selected based on risk factors for progression. Knee MRIs were read separately using both WORMS and BLOKS, and a subset was fed back to readers for reliability assessment. Baseline readings were used for comparison of the two methods for inter-reader reliability as well as agreement on presence/absence and severity of MRI features at both the compartment level and finer anatomical subregion levels.
Results
Both methods had high inter-reader agreement for all features studied (kappa for WORMS 0.69 to 1.0 and for BLOKS 0.65 to 1.0). Although the methods agreed well on presence and severity of morphological cartilage lesions (inter-method kappas from 0.66 to 0.95), BLOKS was more sensitive for full thickness defects. The two methods gave equivalent results for extent (kappa 0.74 to 0.80) and number (Spearman’s Rho = 0.85) of BMLs, and little extra information was obtained using the more complex BLOKS BML scoring. Similar results were also obtained for the common types of meniscal damage and extrusion (inter-method kappa 0.85 to 0.94), but the inclusion in BLOKS of meniscal signal abnormality and uncommon types of tear may be an advantage if these prove clinically meaningful.
In conclusion, both WORMS and BLOKS had high reliability. The two methods gave similar results in this sample for prevalence and severity of cartilage loss, bone marrow lesions and meniscal damage. Selecting between, or combining, the two methods should be based on factors such as reader effort, appropriateness for the goals of a study, and longitudinal performance.
Keywords: osteoarthritis, knee, magnetic resonance imaging, articular cartilage, bone marrow lesions, meniscus
Introduction
Magnetic Resonance Imaging (MRI) can now provide a wealth of information on knee osteoarthritis (OA) pathology, natural history and structure-pain relationships that is not obtainable using radiography[1]. Various approaches for standardized assessments of knee OA from MRI have been developed[2–4], but varying approaches may result in different findings about risk factors and progression of knee OA depending on the method used. There is little information comparing such methods and their performance in assessing the presence, severity and progression of structural findings of knee OA.
Whole Organ MR Scoring (WORMS)[4] and Boston-Leeds Osteoarthritis Knee Scoring (BLOKS)[3] are two methods for semi-quantitatively assessing structural changes of OA from knee MRIs that have been described in detail [3, 4] and have been, and will likely continue to be, widely used in clinical and epidemiological studies [5–11]. Both WORMS and BLOKS methods examine a spectrum of OA-related structural abnormalities including soft tissue, cartilage and bone in the knee at various anatomical subregion locations. While these two methods have very similar scoring schemes for some MRI abnormalities (e.g. ACL and PCL tears, synovitis/effusion, osteophytes), there are substantial differences in the way cartilage defects, subchondral bone marrow lesions (BMLs) and meniscal damage are assessed.
In this study, we investigated how differences between WORMS and BLOKS in scoring cartilage, meniscus and BMLs affect assessment of the presence, extent and severity of structural changes of OA in knees at high risk of cartilage loss, and we assessed agreement between methods and the reliability of scoring. The overall goals were to inform comparisons of findings between studies using one or the other method, to enhance the ability to pool data from the two methods and to aid in the selection of the best features of each method for studies of structural changes in knee OA. In a separate report we compare the sensitivity to detecting change and validity of the two scoring systems.
Methods
Selection Criteria
We studied participants in the Osteoarthritis Initiative (OAI), a longitudinal cohort study of biomarkers and risk factors for the development and progression of knee OA in 4796 individuals aged 45–79, either with, or at risk of developing, knee OA[12]. Starting with a convenience sample of knees that had baseline and 24 month follow-up visit MRIs and knee x-rays available on 08/11/2008, we used known risk factors for disease progression, such as malalignment and bone marrow lesions, to select OA knees with a high risk of cartilage loss. We included knees with a definite osteophyte and mild or moderate tibiofemoral joint space narrowing (JSN), based on a central reading of the baseline knee radiograph (from OAI public datasets 0.2.1 and 1.2.1), and that also had one or more of the following: >2° varus or valgus malalignment based on a hip-knee-ankle angle measured from full limb radiographs[13]; a large tibiofemoral BML at baseline assessed using the BLOKS BML size score from a reading done independently of the ones done for the present study[14]; or worsening tibiofemoral JSN between baseline and 12 months based on the central reading. This yielded 99 knees, which we then supplemented with 16 additional knees from the Progression subcohort that had baseline definite osteophytes and JSN according to a screening reading performed at the OAI clinical centers, and malalignment based on a full limb radiograph, for a total of 115 knees (from 115 participants).
MR Imaging & Scoring Methods
MR images were acquired at 4 OAI clinical centers using dedicated Siemens Trio 3T scanners. Details of the acquisition protocols [15] and the WORMS[4] and BLOKS[3] scoring systems have been published.
Articular surface features (e.g. cartilage, subchondral bone) are scored in 5 subregions for WORMS and 2 subregions for BLOKS in each tibiofemoral compartment, and both use 4 patellofemoral subregions. WORMS splits each tibial plateau into 3 subregions antero-posteriorly (Figure 1A), and each femoral condyle into 2 subregions (“central” and “posterior”). BLOKS defines a single subregion for the tibia and for the “weight-bearing” portion of the femoral condyle. Both methods define an anterior femur/trochlear subregion which is part of the patellofemoral joint, with only slight differences in how that subregion is defined (Figure 1A and 1B). For subdividing medial and lateral sides of the femur, WORMS clearly specifies the border in the coronal plane using a line extending cranially from the lateral edge of the femoral notch (Figure 1C), with the trochlear groove defined as being on the medial side of the femur. Although not clearly defined for BLOKS, that same medial-lateral boundary was used for BLOKS in this study. When lesions extend across multiple subregions, BLOKS instructs readers to assign the lesion to the “most involved” region, but in WORMS readers score each subregion based on the amount of the lesion in that subregion.
Figure 1.
Anatomical subregion locations for WORMS vs. BLOKS: (a) WORMS separation of tibia and femur into anterior, central and posterior subregions, (b) BLOKS separation of femoral condyle into trochlear and weight-bearing subregions, and (c) the medio-lateral division of the femur defined for WORMS [4], which was also used for BLOKS scoring in this study.
WORMS scores morphological lesions of cartilage thinning or focal loss in each subregion using a 7-point scale describing the areal extent of partial thickness and full thickness loss with one score (see Figure 2). In BLOKS, cartilage morphology is scored using two different approaches, a subregional or “cartilage I” score and a “cartilage II” score that describes cartilage integrity in specific locations in a predefined coronal image slice. We only present results from “cartilage I,” which scores cartilage morphology by assigning to each BLOKS subregion separate scores for the areal extent of any cartilage loss and for the percentage of the subregion surface area that has full thickness loss. Both scores are on a 4 point scale, but combined provide 10 possible combinations (see Figure 2). In this study we did not score cartilage signal abnormalities, such as a WORMS score of 1 which indicates increased signal of normal thickness cartilage on T2-weighted images.
Figure 2.
Schematic diagram of the correspondence of different possible categories of WORMS and BLOKS cartilage defect scores for the extent of any cartilage loss and the extent of full thickness cartilage loss. WORMS score descriptions (paraphrased from Peterfy, et al. 2004[4]): 0=normal thickness and signal; 1=normal thickness but increased signal on T2-weighted images (not used in this study); 2.0=partial thickness focal defect <1 cm in greatest width; 2.5=full thickness focal defect <1 cm in greatest width; 3=multiple areas of partial-thickness defects < 75% of region or a single partial thickness defect wider than 1 cm but <75% of the region; 4=diffuse (>75% of the region) partial-thickness loss; 5=multiple areas of full thickness loss <75% of the region or a single full thickness lesion wider than 1 cm but <75% of the region; 6=diffuse (>75% of the region) full-thickness loss. BLOKS score descriptions (from Hunter, et al. 2008[3]): Size of any cartilage loss (including partial and full thickness loss) as a % of surface area as related to the size of each individual region: 0: none; 1: < 10% of region of cartilage surface area; 2: 10 – 75% of region of cartilage surface area; 3: >75% of region of cartilage surface area; AND % full thickness cartilage loss of the region: 0: none; 1: < 10% of region of cartilage surface area; 2: 10 – 75% of region of cartilage surface area; 3: >75% of region of cartilage surface area.
* BLOKS score combination [any loss score/ full thickness loss score]
** typical WORMS equivalent score
+ could also be WORMS 5 depending on exact nature of loss
++ could also be WORMS 2.5 depending on exact nature of loss
Figure 2 compares the possible scores for cartilage morphology between the two methods. Because WORMS grades 2.5 and 5 for full thickness loss do not specify the extent of any additional partial thickness loss, they correspond to several BLOKS score combinations for any loss and full thickness loss. Since the smallest WORMS lesion size is defined as <1 cm while in BLOKS it is defined as <10% of the subregion, the correspondence between some WORMS and BLOKS scores can vary depending on the size of the subregion (e.g. BLOKS 2/0 can correspond to a WORMS 2 or 3).
Both WORMS and BLOKS score meniscal lesions separately for the body and each horn of both menisci. WORMS meniscal damage scores are: 0=intact; 1 = minor radial or parrot beak tear; 2 = nondisplaced tear or prior surgical repair; 3 = displaced tear or partial resection; 4 = complete maceration/destruction or complete resection. BLOKS scores each meniscal subregion as intact, having a signal change, having a tear (including whether vertical, horizontal or complex) or having maceration. Important differences are that BLOKS, but not WORMS, scores signal abnormality (high signal that does not extend to a meniscal surface in 2 or more slices), BLOKS combines partial and full maceration (treated separately in WORMS) into one maceration measure, and BLOKS records root tears. BLOKS scores medio-lateral and anterior extrusion of the meniscal body on a 4 point scale (0: none, 1: < 2mm, 2: 2–5mm, 3: >5mm extruded) and although not included in the original WORMS scale, a modified version[16] assesses medio-lateral extrusion on a 3 point scale (0: none, 1: < 50% extruded, 2: >50% extruded).
For BMLs in WORMS, the score assigned (on a 4 point scale) to each anatomic subregion reflects the volume of the subregion occupied by diffuse edema-like BML. A subregion with one large lesion can be assigned the same score as a subregion with multiple small unconnected regions of BML. BLOKS differs from WORMS in two main ways: (1) in BLOKS readers record every individual subchondral BML and whether each is ill-defined or cystic (or a combination) and (2) each lesion is scored for three separate parameters: size, proportion of lesion in contact with the subchondral bone plate and proportion of cystic lesions within the BML, with each score being on a 4-point scale.
MRI Reading Procedures
Two experienced musculoskeletal (MSK) radiologists (AG, FR) read the MRIs from each knee as a pair, but were blinded to the order of the time points and to subjects’ clinical characteristics.
For damage to the meniscal body, coronal IW TSE and coronal MPR of 3D DESS WE images were used. For the meniscal horns, sagittal IW TSE (fat suppressed) and sagittal 3D DESS WE sequences were used. In cases of uncertainty axial MPR of 3D DESS images were viewed. For BML scoring in WORMS only sagittal IW (fs) images were used. To assess the cystic part of BMLs in BLOKS coronal IW TSE, sagittal 3D DESS, and coronal MPR of DESS images were considered. For cartilage assessment all 5 sequences were taken into account depending on the subregion to be scored.
Knees were read at both time points for cartilage morphology, meniscal damage/extrusion, and BMLs in batches of 12 or 13 using one scoring method at a time, with each knee being assessed by the same reader using both methods but at different times. The reading method to be used for a batch was randomly assigned, until all batches assigned to each reader were assessed using both methods. There was an interval of at least two weeks between readings of the same knee using different methods. The number of knees assessed by each reader was roughly equal. Twenty-five knees were read by both readers for inter-rater reliability.
Analytical Strategy
Using WORMS, the presence of cartilage loss in a subregion was defined as a score >=2 in that subregion, and presence of full thickness loss was defined as a score 2.5, 5 or 6 in the subregion. For BLOKS, presence of any cartilage loss or of full thickness loss in a subregion were defined as a score of > 0 for each type, respectively. Presence/absence of cartilage loss and full thickness loss within a compartment was defined based on the highest score in any of the subregions of a given compartment. An analogous approach was used to summarize BML scores over subregions of a compartment. BLOKS and WORMS use the same anatomical subdivisions of the meniscus, so for comparison of meniscal damage, direct examination of subregion scores was used.
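The presence/absence definitions above can be sketched in code. This is a minimal illustration, not the study's analysis code: the function names and the example scores are hypothetical, a BLOKS score is represented as an assumed pair (any-loss score, full thickness score), and the compartment rule is read here as "present if present in any subregion of the compartment".

```python
from typing import Callable, Iterable, Tuple

# WORMS grades that include full thickness loss, as listed in the text.
WORMS_FULL_THICKNESS = {2.5, 5, 6}

def worms_any_loss(score: float) -> bool:
    """Any cartilage loss in a WORMS subregion: score >= 2."""
    return score >= 2

def worms_full_thickness(score: float) -> bool:
    """Full thickness loss in a WORMS subregion: score of 2.5, 5 or 6."""
    return score in WORMS_FULL_THICKNESS

def bloks_any_loss(score: Tuple[int, int]) -> bool:
    """BLOKS pair (any-loss score, full thickness score): any-loss score > 0."""
    return score[0] > 0

def bloks_full_thickness(score: Tuple[int, int]) -> bool:
    """BLOKS pair (any-loss score, full thickness score): full thickness score > 0."""
    return score[1] > 0

def compartment_present(subregion_scores: Iterable, is_present: Callable) -> bool:
    """Assumed rule: a feature is present in a compartment if any subregion has it."""
    return any(is_present(s) for s in subregion_scores)

# Hypothetical scores for one medial tibiofemoral compartment.
worms_medial = [0, 3, 2.5, 0, 4]   # 5 WORMS subregions
bloks_medial = [(2, 0), (1, 1)]    # 2 BLOKS subregions

print(compartment_present(worms_medial, worms_any_loss))        # True
print(compartment_present(worms_medial, worms_full_thickness))  # True
print(compartment_present(bloks_medial, bloks_full_thickness))  # True
```

Because the WORMS scale mixes partial and full thickness grades, the sketch tests each subregion for full thickness loss rather than taking a numeric maximum, which would miss a 2.5 next to a 3 or 4.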
To test correlations between features on ordinal scales, we used Spearman’s rank coefficient. For inter-rater and inter-method agreement, we calculated kappa statistics (simple for dichotomous and categorical variables, weighted for ordinal variables).
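As an illustration of the simple (unweighted) kappa used for dichotomous features, the sketch below computes Cohen's kappa from a square contingency table. The function name is ours; the example counts are the medial tibio-femoral crosstabulations reported in Table 2 (rows BLOKS No/Yes, columns WORMS No/Yes).

```python
from typing import Sequence

def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """Unweighted Cohen's kappa for a square table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: proportion of counts on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# Medial tibio-femoral compartment counts from Table 2.
any_lesion = [[8, 2], [1, 102]]
full_thickness = [[65, 0], [8, 40]]

print(round(cohen_kappa(any_lesion), 2))      # 0.83
print(round(cohen_kappa(full_thickness), 2))  # 0.85
```

Both values match the inter-method kappas reported for that compartment in Table 2; confidence intervals are not computed in this sketch.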
Results
Subject Characteristics
Sixty (52%) of the 115 participants were male, 62 (54%) had a BMI ≥ 30, 41% were less than 60 years old and 26% older than 70. Based on central readings of all knee x-rays, 107/115 knees had a definite osteophyte and non-end stage joint space narrowing on x-ray; 2 had a K-L grade <2 and 6 were K-L grade 4. 28% of knees were selected based on having JSN progression from baseline to 12 months, and 66% were malaligned (varus or valgus). At baseline 34% had a large BML, and 89% had frequent pain in the knee studied.
Cartilage Morphology Scoring
Subregion Level
Only some of the score values (WORMS 0, 3 and 5; BLOKS combinations 0/0, 2/0 and 2/2) were used often (Figure 3). A higher proportion of subregions had no cartilage loss on WORMS compared to BLOKS, consistent with the greater number and smaller size of the WORMS subregions. Certain WORMS subregions had a much lower prevalence of cartilage defects than others (e.g. cartilage defects were present in 76% of central medial tibia subregions, but prevalence was <25% in the anterior and posterior medial tibia subregions and only 4% in the anterior lateral tibia).
Figure 3.
(a) Distribution (% of all tibiofemoral and patellofemoral subregions affected) of WORMS cartilage morphology scores; grades 2, 3 and 4 are partial thickness only. (b) Distribution (% of all tibiofemoral and patellofemoral subregions affected) of the possible combinations of BLOKS cartilage morphology scores.
For subregions with only partial thickness loss, WORMS categories (2, 3, and 4) and BLOKS categories (1/0, 2/0 and 3/0, respectively) are roughly equivalent and those categories were assigned with similar relative frequencies. Moderate areas (10–75%) of partial thickness cartilage loss with no full thickness lesion were the most frequent scores in both methods.
Small full thickness lesions were infrequently scored in both methods. A WORMS score of 2.5 for a minimal area (<1cm) of full thickness loss is described as an isolated lesion, so it is unclear as to what WORMS score should be used for subregions with a minimal area of full thickness loss, but with concomitant partial thickness loss in the same subregion (Figure 2). In contrast, BLOKS allows for identification of subregions with a minimal area (<10%) of full thickness combined with a larger total extent of any loss (2/1 and 3/1), although these were infrequently used in this sample.
Scores indicating moderate areas (WORMS: > 1cm but < 75%, BLOKS: 10% - 75%) of full thickness cartilage loss were the second most common lesion category in both WORMS (5) and BLOKS (2/2, 3/2). The BLOKS category 3/2 indicates a moderate area of full thickness loss in a subregion with extensive (>75%) overall loss. However, this combination, which is not explicitly defined in WORMS but would most likely be scored as 5, was uncommon.
In the PF compartment, WORMS and BLOKS subregions are roughly equivalent, making direct comparison of scores possible (Table 1). At this subregional level, agreement between methods on the absence/presence of any type of lesion was extremely high (kappa=0.93), as was agreement for presence/absence of full thickness loss (kappa=0.87). For context, within-method inter-reader kappas at the subregion level for WORMS and BLOKS, respectively, were 0.96 and 0.93 for any lesion and 0.97 and 0.97 for full thickness lesions. There was also good agreement on the presence and extent of partial thickness loss with no full thickness loss. However, there were several WORMS vs. BLOKS discrepant scores that are informative. For example, WORMS scores of 3, indicating areas of partial but no full thickness loss, and scores of 5, indicating only full thickness loss, were often scored in BLOKS as having areas of both partial and full thickness lesion, suggesting uncertainty when applying WORMS to areas with both full and partial thickness lesions. In addition, a small number of subregions scored normal by each method had small to moderate areas of partial thickness loss according to the other, indicating uncertainty in assessment of these small lesions.
Table 1.
Comparison of baseline WORMS and BLOKS Cartilage scores in patello-femoral subregions.
Columns: BLOKS cartilage morphology score (% of subregion with any cartilage loss / % of subregion with full thickness loss).
WORMS cartilage morphology score | 0/0 | 1/0 | 1/1 | 2/0 | 2/1 | 2/2 | 3/0 | 3/1 | 3/2 | 3/3 | N of subregions |
---|---|---|---|---|---|---|---|---|---|---|---|
0: normal thickness and signal | 162 | 2 | 0 | 7 | 1 | 0 | 0 | 0 | 0 | 0 | 172 |
2: solitary partial thickness focal defect <1 cm width | 1 | 7 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 12 |
2.5: solitary full thickness focal defect <1 cm width | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
3: multiple areas of partial thickness defect intermixed with areas of normal thickness | 5 | 1 | 0 | 169 | 5 | 7 | 1 | 0 | 1 | 0 | 189 |
4: diffuse (>75% of region) partial thickness loss | 0 | 0 | 0 | 2 | 0 | 0 | 12 | 0 | 0 | 0 | 14 |
5: areas of full thickness loss >1 cm and <75% of the region | 0 | 0 | 0 | 3 | 1 | 42 | 0 | 0 | 11 | 3 | 60 |
6: diffuse (>75% of region) full thickness loss | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 9 | 12 |
N of subregions | 168 | 10 | 0 | 185 | 8 | 50 | 13 | 0 | 14 | 12 | 460 |
Compartment and Knee Level
Overall, 112/113 knees (99%) had prevalent cartilage defects by one or the other scoring method. Full thickness loss was detected in 79/113 knees (70%) using WORMS and 88/113 knees (78%) using BLOKS. Similarly, at the compartment level, the prevalence of any cartilage defect was nearly identical using the two methods, but full thickness loss was 15–25% more frequent when scored with BLOKS when compared to WORMS.
Table 2 shows agreement between the two methods at the compartment level for the presence of any type of cartilage defect, and for the presence of full thickness loss. Agreement was high for any cartilage loss (kappas 0.78–0.95) but slightly lower for presence/absence of any full thickness loss (kappas 0.66–0.86). The tendency for readers to score more full thickness lesions with BLOKS than WORMS was seen in all 3 compartments. Inter-rater agreement for each method is also shown in Table 2 for context; kappas were generally higher for inter-reader than inter-method agreement.
Table 2.
Agreement between WORMS and BLOKS on presence of any lesion (left) and full thickness lesions (right) at the compartment level. Cohen’s kappa (κ) is given for each crosstabulation.
Any lesion: WORMS >=2 vs BLOKS size score >0. Full thickness lesion: WORMS 2.5, 5 or 6 vs BLOKS full thickness score >0. Rows give the BLOKS result and columns the WORMS result; κ values [95% CI] are shown in the first column of each block.
Compartment | BLOKS result / statistic | Any lesion: WORMS No | Any lesion: WORMS Yes | Any lesion: Total | Full thickness: WORMS No | Full thickness: WORMS Yes | Full thickness: Total |
---|---|---|---|---|---|---|---|
Medial tibio-femoral | No | 8 | 2 | 10 | 65 | 0 | 65 |
Medial tibio-femoral | Yes | 1 | 102 | 103 | 8 | 40 | 48 |
Medial tibio-femoral | Total | 9 | 104 | 113 | 73 | 40 | 113 |
Medial tibio-femoral | Inter-method κ | 0.83 [0.64–1.00] | | | 0.85 [0.74–0.95] | | |
Medial tibio-femoral | Inter-rater κ, WORMS | 1.0 [1.0–1.0] | | | 0.92 [0.58–1.0] | | |
Medial tibio-femoral | Inter-rater κ, BLOKS | 1.0 [1.0–1.0] | | | 0.68 [0.39–0.96] | | |
Lateral tibio-femoral | No | 33 | 6 | 39 | 76 | 4 | 80 |
Lateral tibio-femoral | Yes | 5 | 69 | 74 | 11 | 22 | 33 |
Lateral tibio-femoral | Total | 38 | 75 | 113 | 87 | 26 | 113 |
Lateral tibio-femoral | Inter-method κ | 0.78 [0.66–0.90] | | | 0.66 [0.50–0.82] | | |
Lateral tibio-femoral | Inter-rater κ, WORMS | 0.88 [0.66–1.0] | | | 0.69 [0.36–1.0] | | |
Lateral tibio-femoral | Inter-rater κ, BLOKS | 1.0 [1.0–1.0] | | | 1.0 [1.0–1.0] | | |
Patello-femoral | No | 10 | 1 | 11 | 61 | 1 | 62 |
Patello-femoral | Yes | 0 | 102 | 102 | 7 | 44 | 51 |
Patello-femoral | Total | 10 | 103 | 113 | 68 | 45 | 113 |
Patello-femoral | Inter-method κ | 0.95 [0.85–1.00] | | | 0.86 [0.76–0.95] | | |
Patello-femoral | Inter-rater κ, WORMS | 1.0 [1.0–1.0] | | | 0.92 [0.76–1.0] | | |
Patello-femoral | Inter-rater κ, BLOKS | 1.0 [1.0–1.0] | | | 0.92 [0.77–1.0] | | |
Meniscal Damage
Table 3 compares the scores assigned to all subregions (anterior, body, posterior) of the medial meniscus. Signal abnormality, not scored in WORMS, was scored using BLOKS in 22% (40/184) of subregions that had no morphological damage on either method. Of the six types of tear identified by BLOKS, horizontal tears were common and scored as WORMS grades 1 or 2, while complex tears were somewhat less common and generally given a WORMS grade 3 (displaced tear or partial resection). In BLOKS the end stage of ‘maceration’ (any loss of normal morphological appearance) comprised over 70% of lesions; these were nearly always scored as WORMS grade 3 (displaced tear/partial resection) and rarely as grade 4 (complete maceration/destruction/resection). Results for the lateral meniscus (not shown) were similar, but with much lower lesion prevalence.
Table 3.
Comparison of BLOKS and WORMS scores in subregions of the medial meniscus. The dashed lines represent division of the categorical BLOKS scale into different severities, and the collapsing of the WORMS scale into a similar 3-point scale. Agreement between these 3-point scales was high (weighted kappa=0.93 [95% CI: 0.90–0.97]).
Columns: BLOKS meniscal tear category (medial meniscus)*.
WORMS meniscal tear score (medial meniscus) | Normal | Signal abnormality | Horizontal tear | Vertical tear | Complex tear | Macerated | Total |
---|---|---|---|---|---|---|---|
Grade 0: intact | 144 | 40 | 1 | 0 | 0 | 4 | 189 |
Grade 1: minor radial tear or parrot-beak tear | 0 | 0 | 5 | 0 | 0 | 0 | 5 |
Grade 2: non-displaced tear or prior surgical repair | 2 | 0 | 26 | 0 | 2 | 2 | 32 |
Grade 3: displaced tear or partial resection | 1 | 1 | 0 | 3 | 8 | 101 | 114 |
Grade 4: complete maceration/destruction/resection | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
Total | 147 | 41 | 32 | 3 | 10 | 111 | 344 |
* Not included in this table is one posterior horn root tear (scored on BLOKS) which was scored as “Grade 2: non-displaced tear” on WORMS.
There was high agreement between the methods at the subregion level for presence/absence of any morphological damage (counting signal abnormality as no damage) in the medial meniscus (kappa=0.94 [95% CI 0.90–0.98]) and in the lateral meniscus (kappa=0.85 [95% CI 0.76–0.94]). This was somewhat higher than inter-reader reliability for presence/absence at the subregion level for WORMS (0.85 [0.77–0.94]) and for BLOKS (0.65 [0.54–0.76]).
The WORMS meniscus score is intended as a scale for severity of damage, while BLOKS does not explicitly assess severity. Nevertheless, there was a high degree of concordance (weighted kappa=0.93 [95% CI: 0.90–0.97]) between them when each was converted to a 3-level scale using the grouping shown by dashed lines in Table 3. Inter-reader agreement was similar for these 3-level scales, with weighted kappas of 0.91 (0.85–0.97) for WORMS and 0.86 (0.77–0.95) for BLOKS.
WORMS and BLOKS scores for extrusion of the medial meniscal body agreed well: simple kappa=0.89 [95% CI 0.74–1.00] for presence/absence of extrusion. BLOKS Grade 3 (> 5mm extruded), rarely used, was always scored as WORMS Grade 2 (>50% extruded). Inter-reader agreement on the grade of medial meniscal extrusion was high (WORMS: weighted kappa=0.81 [95% CI 0.60–1.00]; BLOKS: weighted kappa=0.75 [95% CI 0.50–1.00]). Results were similar for the lateral meniscus.
Anterior extrusion, not scored in WORMS, was rarely scored present in BLOKS without medio-lateral extrusion also being present. When anterior extrusion was scored on its own, it was always minor (Grade 1: < 2mm). This feature also had poor inter-reader reliability for presence/absence in the medial meniscus (simple kappa=0.48 [0.13–0.83]).
Bone Marrow Lesions
Table 4 shows good agreement (weighted kappas 0.74 to 0.80) within compartments between the highest WORMS subregion BML score (% of any subregion occupied by all BMLs) and the BLOKS score for the largest individual BML (% of any subregion occupied by a single BML). This was similar to the level of inter-reader agreement for each method.
Table 4.
Associations for each compartment of the knee, between the BLOKS BML Size score of the largest BML in a compartment with the maximum subregion WORMS BML score in the same compartment, along with weighted kappa (κ) [95% CI] for the agreement between the WORMS score and BLOKS score. Inter-reader kappas for the same features are also presented.
Columns give the BLOKS BML size score (% of subregion) within each compartment; weighted κ values [95% CI] are shown in the first column of each compartment block.
WORMS BML score (% of subregion) | Medial tibio-femoral 0: None | Medial tibio-femoral 1: <10% | Medial tibio-femoral 2: 10–25% | Medial tibio-femoral 3: >25% | Lateral tibio-femoral 0: None | Lateral tibio-femoral 1: <10% | Lateral tibio-femoral 2: 10–25% | Lateral tibio-femoral 3: >25% | Patello-femoral 0: None | Patello-femoral 1: <10% | Patello-femoral 2: 10–25% | Patello-femoral 3: >25% |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 (None) | 29 | 4 | 2 | 0 | 69 | 1 | 4 | 1 | 47 | 2 | 1 | 0 |
1 (<25%) | 3 | 16 | 6 | 0 | 2 | 16 | 4 | 1 | 2 | 9 | 6 | 0 |
2 (25–50%) | 0 | 3 | 21 | 10 | 1 | 1 | 11 | 2 | 0 | 2 | 14 | 12 |
3 (>50%) | 0 | 0 | 4 | 17 | 0 | 0 | 0 | 2 | 0 | 1 | 3 | 16 |
Inter-method weighted κ | 0.76 [0.69–0.84] | | | | 0.74 [0.62–0.86] | | | | 0.80 [0.72–0.86] | | | |
Inter-rater weighted κ, WORMS | 0.78 [0.63–0.93] | | | | 0.80 [0.57–1.00] | | | | 0.94 [0.85–1.00] | | | |
Inter-rater weighted κ, BLOKS | 0.79 [0.61–0.97] | | | | 0.92 [0.82–1.00] | | | | 0.74 [0.56–0.92] | | | |
NB: table does not include BMLs in the subspinous region.
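To make the weighted kappa in Table 4 concrete, the sketch below computes a linearly weighted kappa from the medial tibio-femoral counts (rows WORMS 0 to 3, columns BLOKS 0 to 3). The weighting scheme is an assumption on our part, since the Methods state only that weighted kappas were used for ordinal variables; the function name is also ours.

```python
from typing import Sequence

def linear_weighted_kappa(table: Sequence[Sequence[float]]) -> float:
    """Cohen's kappa with linear agreement weights w = 1 - |i - j| / (k - 1).

    Linear weighting is an assumption; the source does not name the scheme.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = 0.0  # weighted observed agreement
    p_exp = 0.0  # weighted agreement expected from the marginals
    for i in range(k):
        for j in range(k):
            w = 1 - abs(i - j) / (k - 1)
            p_obs += w * table[i][j] / n
            p_exp += w * row_tot[i] * col_tot[j] / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# Medial tibio-femoral counts from Table 4.
medial = [
    [29, 4, 2, 0],
    [3, 16, 6, 0],
    [0, 3, 21, 10],
    [0, 0, 4, 17],
]
print(round(linear_weighted_kappa(medial), 2))  # 0.76
```

Under this assumption the result agrees with the 0.76 reported for the medial compartment; other weighting schemes (for example quadratic weights) would give a different value.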
Of the BMLs scored in BLOKS, 56 (11%) were <10% edema (Table 5) and might not be scored as BMLs in WORMS. Repeating the analyses in Table 4 excluding these largely cystic BMLs had essentially no effect on the agreement between methods. For small (size score=1) BLOKS BMLs, 70% were almost completely composed of edema. For large BMLs (size score=3), the majority were a combination of both edema and cyst. BLOKS scores for adjacency of BMLs to the subchondral plate were largely redundant with the BML size score (Table 5).
Table 5.
Showing (top) the association between the size of a BML and the amount of the BML that is edema (as opposed to cystic), and (bottom) the size of a BML and the amount of the BML that is adjacent to the subchondral plate for each of the 512 lesions scored using BLOKS.
BLOKS BML size score | Edema score 0: None | Edema score 1: <10% | Edema score 2: 10–85% | Edema score 3: >85% | Total |
---|---|---|---|---|---|
1: < 10% | 17 (7%) | 23 (10%) | 30 (13%) | 159 (70%) | 229 (100%) |
2: 10–25% | 5 (3%) | 9 (5%) | 68 (37%) | 104 (56%) | 186 (100%) |
3: > 25% | 0 (0%) | 2 (2%) | 56 (56%) | 39 (40%) | 97 (100%) |

BLOKS BML size score | Adjacency score 0: None | Adjacency score 1: <10% | Adjacency score 2: 10–25% | Adjacency score 3: >25% | Total |
---|---|---|---|---|---|
1: < 10% | 3 (1%) | 226 (99%) | 0 (0%) | 0 (0%) | 229 (100%) |
2: 10–25% | 0 (0%) | 12 (6%) | 173 (93%) | 1 (1%) | 186 (100%) |
3: > 25% | 0 (0%) | 1 (1%) | 9 (9%) | 87 (90%) | 97 (100%) |
NB: BLOKS BML Size score=0 is not included because that score means no BML was present.
We compared the number of BMLs present in a knee according to BLOKS with the number of WORMS subregions in the knee in which at least one BML was present. We found a strong association between these two parameters (Spearman’s Rho = 0.85). We also calculated a summary measure of BML severity within each knee for both methods. For BLOKS we used the sum of BML size scores for all BMLs in a knee and for WORMS we used the sum of the BML size scores for all the subregions of the knee. These two measures were strongly associated (Spearman’s Rho = 0.88).
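The knee-level summaries described above, and the rank correlation between them, can be sketched as follows. All names and the example scores are hypothetical and do not come from the study data; Spearman's rho is computed here as the Pearson correlation of average ranks, which handles ties.

```python
from typing import List, Sequence, Tuple

def bloks_bml_summary(lesion_sizes: Sequence[int]) -> Tuple[int, int]:
    """BLOKS: (number of BMLs, sum of size scores), one size score per lesion."""
    return len(lesion_sizes), sum(lesion_sizes)

def worms_bml_summary(subregion_scores: Sequence[int]) -> Tuple[int, int]:
    """WORMS: (number of subregions with a BML, sum of subregion scores)."""
    return sum(1 for s in subregion_scores if s > 0), sum(subregion_scores)

def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical knees: BLOKS lesion size scores and WORMS subregion scores.
bloks_knees = [[1, 2], [], [3, 3, 1], [2]]
worms_knees = [[1, 0, 2, 0], [0, 0, 0, 0], [3, 2, 1, 0], [0, 1, 0, 0]]

bloks_counts = [bloks_bml_summary(k)[0] for k in bloks_knees]
worms_counts = [worms_bml_summary(k)[0] for k in worms_knees]
print(spearman_rho(bloks_counts, worms_counts))
```

The same `spearman_rho` call applies to the summed size scores by taking the second element of each summary instead of the first.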
Discussion
For cartilage, meniscus and subchondral BMLs, key features for which WORMS and BLOKS are substantively different, we found no major differences between the methods in assessing the presence, extent and severity of those structural changes of OA. In addition, inter-rater reliability for all features was high for both scoring methods, with the exception of scoring anterior meniscal extrusion in BLOKS. Our results suggest that selecting between, or combining, the approaches of each method may be largely influenced by practical factors such as differences in reader effort, appropriateness for the goals of a specific study, and longitudinal performance.
Cartilage Morphology Scoring
While BLOKS assesses partial and full thickness cartilage loss separately, the WORMS approach of combining the extent of full thickness and partial thickness loss into a single score potentially results in a scale that is not ordinal. Having 2 separate scores for cartilage loss in BLOKS avoids the problems raised by WORMS, such as whether a WORMS grade 2.5 (small focal full thickness loss) is more or less severe (or simply a different type of damage) than grade 3 or 4 (larger partial thickness loss only). Also, with WORMS there is no obvious way to score a subregion with >10% partial thickness loss that has developed a tiny amount of full thickness loss: the full thickness area is too small to be considered Grade 5, but the partial thickness loss is too large for the lesion to truly be considered an isolated focal lesion (as Grade 2.5 is defined).
Despite such differences, we found that the overall picture of cartilage loss did not differ much between the methods: agreement was good for presence of cartilage defects at the subregion, compartment and knee level (Tables 1 and 2). WORMS cartilage scores allowed differentiation of the presence of partial thickness only loss from full thickness loss within both subregions and compartments, with good agreement with BLOKS (see Table 2), although BLOKS did show a slightly higher prevalence of full thickness loss. This result suggests that in WORMS, when a subregion is mixed between small full thickness loss and more extensive partial thickness loss, and therefore has no well defined WORMS grade, readers may tend to score the extent of partial thickness loss as the predominant type of damage in a given subregion.
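One way to derive the partial-only versus full-thickness categories from WORMS cartilage grades is sketched below. The grouping of grades is an assumption: the text above identifies grade 2.5 as small focal full thickness loss, grades 3 and 4 as partial thickness loss only, and grade 5 as full thickness loss, while the placement of grades 2 and 6 follows the published WORMS scale [4]. This is not necessarily the rule used in the analyses, and the function names are hypothetical.

```python
# Assumed grouping of WORMS cartilage grades (see the lead-in for sources).
PARTIAL_ONLY_GRADES = {2.0, 3.0, 4.0}   # partial thickness loss only
ANY_FULL_GRADES = {2.5, 5.0, 6.0}       # some full thickness loss present


def classify_subregion(grade):
    """Return 'full', 'partial' or 'none' for one WORMS subregion grade.

    Grades 0 and 1 (normal thickness) are treated as no morphologic loss.
    """
    if grade in ANY_FULL_GRADES:
        return "full"
    if grade in PARTIAL_ONLY_GRADES:
        return "partial"
    return "none"


def classify_compartment(grades):
    """Summarise a compartment by the most severe type found in its subregions."""
    labels = {classify_subregion(g) for g in grades}
    if "full" in labels:
        return "full"
    if "partial" in labels:
        return "partial"
    return "none"


# Hypothetical compartment with three subregions (invented grades).
print(classify_compartment([0, 3, 2.5]))
```

Under this grouping a subregion with mixed damage that a reader scores as grade 3 or 4 is counted as partial-only, which is the mechanism suggested above for the slightly lower prevalence of full thickness loss in WORMS.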
Both methods allow for scoring of small focal areas of either partial or full thickness loss, but this was uncommon in the knees we studied, which is likely due to our selection of knees with mild or moderate JSN. Other studies that have included knees with less severe radiographic disease have found a higher prevalence of small focal defects, suggesting these may be a precursor of more extensive cartilage damage[17]. On the other hand, very extensive areas (>75% of area) of either partial or full thickness loss were also uncommon, consistent with our selection of knees that were not end-stage.
Meniscal Damage Scoring
Despite the focus of BLOKS on the presence/absence of specific types of meniscal tear and the focus of WORMS on scaling meniscal damage by severity, there appeared to be very good agreement between assessments of meniscal damage using WORMS and BLOKS. Further, we found that BLOKS categories of meniscal tear type can be mapped into a severity ranking that closely agrees with the meniscal damage severity scaling of WORMS.
Signal abnormalities assessed using BLOKS, but not included in WORMS, were seen in about one quarter of otherwise normal menisci in this study. If these common signal abnormalities prove to be clinically meaningful, their scoring in BLOKS would be advantageous. Similarly, recording uncommon types of tears such as vertical tears and root tears, as done in BLOKS, would be an advantage in large-scale studies if these proved to be clinically important. Further study of these meniscal abnormalities is needed. BLOKS does not differentiate partial from complete maceration of the meniscus, as is done in WORMS, but the latter was uncommon in our study. It may be more common in knees with more severe OA.
Meniscal extrusion scored medio-laterally showed good agreement between BLOKS and modified WORMS for both the presence/absence of extrusion and the severity of the extrusion. The WORMS extrusion scale may be preferred because it is simpler and does not require measurements. WORMS does not score anterior extrusion, but this was rarely seen without concomitant medio-lateral extrusion, and its presence was scored with poor inter-reader reliability.
Bone Marrow Lesion Scoring
BLOKS scoring of BMLs is far more complex and time-consuming than WORMS. We did not identify any clear differences in results or advantages compared to WORMS that would justify the added complexity and effort. BLOKS scores not only BML size but also provides a count of the number of BMLs within a knee. However, this number was highly associated with the number of subregions scored as having diffuse edema-like BMLs in WORMS, suggesting that the simpler WORMS subregion count provides equivalent data to the number of BMLs. We also found a strong association between the sum of WORMS BML size scores over all subregions of the knee and the sum of BLOKS BML size scores over all individual BMLs. For each compartment there was high agreement between the highest WORMS BML score and the size of the largest BLOKS BML (Table 4), suggesting that the smaller percentages used to define the BLOKS categories compared to WORMS largely compensate for the much larger BLOKS subregions.
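The compartment-level comparison described above (highest WORMS BML score versus size of the largest BLOKS BML) can be sketched as a kappa calculation. The linear weighting used here is an assumption, since this section does not state which weighting scheme was applied, and the function names and scores are hypothetical.

```python
def compartment_max(scores):
    """Highest BML size score in a compartment (0 if nothing was scored)."""
    return max(scores, default=0)


def kappa(a, b, categories, weighted=True):
    """Cohen's kappa between two paired lists of ordinal scores.

    With weighted=True, disagreements are penalised linearly by their
    distance on the scale (an assumed scheme); otherwise all
    disagreements count equally.
    Raises ZeroDivisionError if expected disagreement is zero.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Observed joint proportions of (method A score, method B score).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    d_obs = sum(penalty(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(penalty(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp


# Hypothetical scores for the same compartment in five knees (invented).
worms_max = [compartment_max(s) for s in [[0, 1], [2, 3], [0, 0], [1, 1], [3]]]
bloks_max = [compartment_max(s) for s in [[1], [3, 1], [], [2], [3]]]
print(kappa(worms_max, bloks_max, categories=[0, 1, 2, 3]))
```

A weighted statistic is a natural choice for ordinal size grades because a one-grade disagreement is penalised less than a three-grade disagreement.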
We found that the BLOKS score for adjacency of BMLs to the subchondral plate was largely redundant with the BML size score and probably adds little or no information (Table 5). BLOKS, but not WORMS, assesses each BML for the extent of area that is a cystic lesion. However, BLOKS identified few primarily cystic BMLs and excluding these did not alter the high level of agreement between the methods.
Limitations
Our study has several limitations. We did not include all published methods for scoring knee OA from MRIs (e.g. [2]); rather, we focused on the two most widely used and excluded methods that have not been as extensively described or applied in epidemiological and clinical studies. We studied knees with definite OA and some JSN that was not ‘end-stage’ and that were at high risk of cartilage loss. Most knees already had significant structural changes. Our results may not apply to knees with other clinical and radiographic characteristics, and we cannot comment on any differences between the two methods in assessing MRI features of either early OA or end-stage disease. The prevalence of abnormalities in our sample is also likely to be higher than in a representative sample of Kellgren and Lawrence grade 2 or 3 knees. The ‘cartilage II’ instrument of BLOKS was not considered in our analyses because WORMS does not offer a comparable single-slice assessment at defined locations. The utility of this streamlined approach, which looks only at defined locations rather than subregions, needs to be explored.
Conclusion
For scoring cartilage morphology, BMLs, meniscal damage and medio-lateral meniscal extrusion, WORMS and BLOKS appeared to provide equivalent information and both methods had high levels of inter-reader reliability. BLOKS scoring of BMLs is much more complex than WORMS and this adds substantial effort to reading. Our results do not support the value of this additional effort. On the other hand, the greater number of articular surface and subchondral bone subregions in WORMS compared to BLOKS means more effort in using the former. While none of our analyses specifically addressed the added value of the finer grained anatomical subregions in WORMS scoring, they could provide more useful data on spatial interrelationships between the various structural changes in knee OA than the much larger swaths of joint tissue contained in the BLOKS subregions. Newly published data[18, 19] suggest that co-location of cartilage defects, bone marrow edema and meniscal damage may provide insight about the pathophysiology and natural history of knee OA.
In summary, our results suggest that in cross-sectional studies of the prevalence and severity of cartilage loss, bone marrow lesions and meniscal damage, WORMS and BLOKS are likely to give equivalent results in similar cohorts of patients. BLOKS provides potentially useful additional information on types of meniscal lesions. Analyses comparing the two methods for assessing structural progression of knee OA and relationships with risk factors for progression of knee OA are in an accompanying article.
Acknowledgements
The authors wish to thank members of the OAI Publications Committee who reviewed and provided comments and advice in the preparation of this manuscript. This manuscript has received the approval of the OAI Publications Committee based on a review of its scientific content and data interpretation.
Acknowledgement and Funding Sources
The work performed for this study was funded by the Osteoarthritis Initiative (OAI). The OAI is a public-private partnership comprised of five contracts (N01-AR-2-2258; N01-AR-2-2259; N01-AR-2-2260; N01-AR-2-2261; N01-AR-2-2262) funded by the National Institutes of Health, a branch of the Department of Health and Human Services, and conducted by the OAI Study Investigators. Private funding partners include Merck Research Laboratories; Novartis Pharmaceuticals Corporation; GlaxoSmithKline; and Pfizer, Inc. Private sector funding for the OAI is managed by the Foundation for the National Institutes of Health. Additional support came from NIH AR47785 and NIH AR051568.
The study sponsors had no involvement in the study design, the collection, analysis and interpretation of the data, or in the writing or submission of this manuscript.
Footnotes
Author Contributions
All authors made substantial contributions to either the study design, acquisition of data or analysis of the results, as well as to the writing of this article and approval of the final version. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Conflict of Interest
Ali Guermazi is president of Boston Imaging Core Lab, LLC (BICL), a company providing radiological image assessment services. He is a shareholder of Synarc, Inc, a consultant to Merck Serono, Facet Solutions, Stryker, and Genzyme, and has grant support from GE Healthcare. Frank Roemer is a shareholder of BICL.
None of the remaining authors have declared any possible conflict of interest.
Contributor Information
John A. Lynch, University of California at San Francisco, San Francisco, CA.
Frank W. Roemer, Klinikum Augsburg, Augsburg, Germany.
Michael C. Nevitt, University of California at San Francisco, San Francisco, CA.
David T. Felson, Boston University School of Medicine, Boston, MA.
Jingbo Niu, Boston University School of Medicine, Boston, MA.
Charles B. Eaton, Memorial Hospital of Rhode Island, Pawtucket, RI.
Ali Guermazi, Boston University School of Medicine, Boston, MA.
References
- 1. Wenham CYJ, Conaghan PG. Imaging the painful osteoarthritic knee joint: what have we learned? Nature Clinical Practice Rheumatology. 2009;5:149–158. doi: 10.1038/ncprheum1023.
- 2. Kornaat PR, Ceulemans RY, Kroon HM, Riyazi N, Kloppenburg M, Carter WO, et al. MRI assessment of knee osteoarthritis: Knee Osteoarthritis Scoring System (KOSS)--inter-observer and intra-observer reproducibility of a compartment-based scoring system. Skeletal Radiol. 2005;34:95–102. doi: 10.1007/s00256-004-0828-0.
- 3. Hunter DJ, Lo GH, Gale D, Grainger AJ, Guermazi A, Conaghan PG. The reliability of a new scoring system for knee osteoarthritis MRI and the validity of bone marrow lesion assessment: BLOKS (Boston-Leeds Osteoarthritis Knee Score). Annals of the Rheumatic Diseases. 2008;67:206–211. doi: 10.1136/ard.2006.066183.
- 4. Peterfy CG, Guermazi A, Zaim S, Tirman PFJ, Miaux Y, White D, et al. Whole-organ magnetic resonance imaging score (WORMS) of the knee in osteoarthritis. Osteoarthritis and Cartilage. 2004;12:177–190. doi: 10.1016/j.joca.2003.11.003.
- 5. Amin S, LaValley MP, Guermazi A, Grigoryan M, Hunter DJ, Clancy M, et al. The relationship between cartilage loss on magnetic resonance imaging and radiographic progression in men and women with knee osteoarthritis. Arthritis and Rheumatism. 2005;52:3152–3159. doi: 10.1002/art.21296.
- 6. Hunter DJ, Zhang YQ, Niu JB, Tu XH, Amin S, Goggins J, et al. Structural factors associated with malalignment in knee osteoarthritis: The Boston Osteoarthritis Knee Study. Journal of Rheumatology. 2005;32:2192–2199.
- 7. Carbone LD, Nevitt MC, Wildy K, Barrow KD, Harris F, Felson D, et al. The relationship of antiresorptive drug use to structural findings and symptoms of knee osteoarthritis. Arthritis and Rheumatism. 2004;50:3516–3525. doi: 10.1002/art.20627.
- 8. Roemer FW, Guermazi A, Hunter DJ, Niu J, Zhang Y, Englund M, et al. The association of meniscal damage with joint effusion in persons without radiographic osteoarthritis: the Framingham and MOST osteoarthritis studies. Osteoarthritis and Cartilage. 2009;17:748–753. doi: 10.1016/j.joca.2008.09.013.
- 9. Roemer FW, Neogi T, Nevitt MC, Felson DT, Zhu Y, Zhang Y, et al. Subchondral bone marrow lesions are highly associated with, and predict subchondral bone attrition longitudinally: the MOST study. Osteoarthritis and Cartilage. 2010;18:47–53. doi: 10.1016/j.joca.2009.08.018.
- 10. Neogi T, Felson D, Niu J, Lynch J, Nevitt M, Guermazi A, et al. Cartilage Loss Occurs in the Same Subregions as Subchondral Bone Attrition: A Within-Knee Subregion-Matched Approach From the Multicenter Osteoarthritis Study. Arthritis & Rheumatism-Arthritis Care & Research. 2009;61:1539–1544. doi: 10.1002/art.24824.
- 11. Felson DT, Niu J, Guermazi A, Roemer F, Aliabadi P. Correlation of the development of knee pain with enlarging bone marrow lesions on magnetic resonance imaging. Arthritis and Rheumatism. 2007;56:2986–2992. doi: 10.1002/art.22851.
- 12. Felson DT, Nevitt MC. Epidemiologic studies for osteoarthritis: new versus conventional study design approaches. Rheum Dis Clin North Am. 2004;30:783–797, vii. doi: 10.1016/j.rdc.2004.07.005.
- 13. Felson DT, Cooke TD, Niu J, Goggins J, Choi J, Yu J, et al. Can anatomic alignment measured from a knee radiograph substitute for mechanical alignment from full limb films? Osteoarthritis Cartilage. 2009;17:1448–1452. doi: 10.1016/j.joca.2009.05.012.
- 14. Lo GH, Hunter DJ, Zhang YQ, McLennan CE, LaValley MP, Kiel DP, et al. Bone marrow lesions in the knee are associated with increased local bone density. Arthritis and Rheumatism. 2005;52:2814–2821. doi: 10.1002/art.21290.
- 15. Peterfy CG, Schneider E, Nevitt M. The osteoarthritis initiative: report on the design rationale for the magnetic resonance imaging protocol for the knee. Osteoarthritis Cartilage. 2008;16:1433–1441. doi: 10.1016/j.joca.2008.06.016.
- 16. Englund M, Guermazi A, Roemer FW, Aliabadi P, Yang M, Lewis CE, et al. Meniscal Tear in Knees Without Surgery and the Development of Radiographic Osteoarthritis Among Middle-Aged and Elderly Persons: The Multicenter Osteoarthritis Study. Arthritis and Rheumatism. 2009;60:831–839. doi: 10.1002/art.24383.
- 17. Roemer FW, Zhang YQ, Niu JB, Lynch JA, Crema MD, Marra MD, et al. Tibiofemoral Joint Osteoarthritis: Risk Factors for MR-depicted Fast Cartilage Loss over a 30-month Period in the Multicenter Osteoarthritis Study. Radiology. 2009;252:772–780. doi: 10.1148/radiol.2523082197.
- 18. Hernandez-Molina G, Neogi T, Hunter DJ, Niu J, Guermazi A, Reichenbach S, et al. The association of bone attrition with knee pain and other MRI features of osteoarthritis. Annals of the Rheumatic Diseases. 2008;67:43–47. doi: 10.1136/ard.2007.070565.
- 19. Roemer FW, Guermazi A, Javaid MK, Lynch JA, Niu J, Zhang Y, et al. Change in MRI-detected subchondral bone marrow lesions is associated with cartilage loss: the MOST Study. A longitudinal multicentre study of knee osteoarthritis. Annals of the Rheumatic Diseases. 2009;68:1461–1465. doi: 10.1136/ard.2008.096834.