Objectives: To assess the reliability of evaluating change in lumbar magnetic resonance imaging (MRI) findings retrospectively, both by direct comparison of images and by non-comparison.
Materials and methods: Pre-treatment and 2-year follow-up MRI was performed in 126 patients randomized to disc prosthesis surgery or non-surgical treatment. Two experienced radiologists independently evaluated progress and regress for Modic changes, disc findings, and facet arthropathy (FA) at L3/L4, L4/L5, and L5/S1, both by non-comparison and by comparison of initial and follow-up images. FA was evaluated at all levels, and other findings at non-operated levels. We calculated prevalence- and bias-adjusted kappa (PABAK) values for interobserver agreement. The impact of an adjacent prosthesis (which causes artefacts) and image evaluation method on PABAK was assessed using generalized estimating equations.
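For illustration only, PABAK can be computed directly from the observed proportion of agreement between two raters. The sketch below uses the standard definition, PABAK = (k·Po − 1)/(k − 1), which reduces to 2·Po − 1 for two categories. The function name `pabak`, the binary coding (1 = change, 0 = no change), and the example ratings are assumptions made for this sketch and are not data or code from the study.

```python
def pabak(ratings_a, ratings_b, n_categories=2):
    """Prevalence- and bias-adjusted kappa for two raters.

    Standard definition: (k * Po - 1) / (k - 1), where Po is the observed
    proportion of agreement and k is the number of rating categories.
    Hypothetical helper for illustration; not taken from the study.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two rating categories are required")
    # Observed agreement: share of cases where both raters gave the same rating
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    p_observed = agreements / len(ratings_a)
    return (n_categories * p_observed - 1) / (n_categories - 1)


# Made-up ratings for four cases (1 = progress, 0 = no progress)
rater_1 = [1, 0, 0, 0]
rater_2 = [1, 0, 0, 1]
print(pabak(rater_1, rater_2))  # 3 of 4 agree: Po = 0.75, PABAK = 0.5
```

Because PABAK depends only on observed agreement, it is not lowered by a skewed prevalence of the finding, which is presumably relevant when progress or regress is rare.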
Results: Image comparison yielded good interobserver agreement on progress and regress (PABAK 0.63-1.00) for Modic changes, posterior high-intensity zone, disc height, and disc contour at L3-S1 and for nucleus pulposus signal and FA at L3/L4, and moderate interobserver agreement (PABAK 0.46-0.59) on decreasing nucleus signal and increasing FA at L4-S1. With image comparison, interobserver agreement was lower (but fair; PABAK 0.29) only for increasing FA at L5/S1 in patients with a prosthesis at L4/L5 and/or L5/S1. An adjacent prosthesis had no overall impact on PABAK values (p ≥ 0.22). Comparison yielded higher PABAK values than non-comparison (p < 0.001).
Conclusions: Regarding changes in lumbar MRI findings over time, comparison of images can provide moderate or good interobserver agreement, and better agreement than non-comparison. An adjacent prosthesis may not reduce agreement on change for most findings.