Objective: To (1) assess the intra- and inter-rater reliability of different ultrasound (US) measures of the lumbar multifidus muscle in subjects with and without chronic low back pain and (2) test 3 strategies for enhancing reliability: comparing different tasks, using a transducer position template, and averaging trials within or between days.
Design: Cross-sectional repeated-measures design.
Setting: Laboratory setting.
Patients: Fifteen subjects with chronic low back pain and 15 control subjects.
Methods: Subjects (n = 30) performed contralateral arm lifting and contralateral leg lifting while in the prone position. Two 7-second videos of the lumbar multifidus (from rest to contraction) were collected with and without a template (transparency) used to reposition the transducer on the skin. One of the 2 raters repeated the testing 7 to 14 days later so that intrarater reliability could be assessed in addition to inter-rater reliability. Reliability was assessed using generalizability theory as a framework.
Main outcome measurements: US imaging measures of the lumbar multifidus thickness were obtained in patients at rest and during standardized contractions (hereafter called primary measures) at 2 vertebral levels and on both sides. These primary measures were used to calculate different, potentially useful US parameters (hereafter called derived measures).
Results: Intrarater reliability was better than inter-rater reliability, and primary measures were more reliable than derived measures. The tasks investigated showed comparable reliability, and the transducer position template did not increase reliability. Averaging the measures of 3 images increased reliability substantially.
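The effect of averaging can be illustrated with a decision-study style projection from generalizability theory. The sketch below assumes a simplified one-facet design (subjects crossed with images) in which all measurement error is pooled into a single variance component. The function names and the variance values are hypothetical, chosen only for illustration; they are not the study's estimates, and the study's actual design includes additional facets (rater, day).

```python
import math


def projected_dependability(var_subjects, var_error, n_images):
    """Project a reliability (dependability) coefficient for the mean of n images.

    Assumes a one-facet design: the error variance of a mean of n images
    is the single-image error variance divided by n.
    """
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    return var_subjects / (var_subjects + var_error / n_images)


def projected_sem(var_error, n_images):
    """Standard error of measurement for the mean of n images."""
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    return math.sqrt(var_error / n_images)


if __name__ == "__main__":
    # Hypothetical variance components (arbitrary units), NOT from the study.
    var_subjects = 1.0
    var_error = 1.0
    for n in (1, 2, 3, 5):
        coef = projected_dependability(var_subjects, var_error, n)
        sem = projected_sem(var_error, n)
        print(f"n = {n}: coefficient = {coef:.2f}, SEM = {sem:.2f}")
```

With these illustrative values, the projected coefficient rises from 0.50 for a single image to 0.75 for the mean of 3 images, with diminishing gains for each additional image.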
Conclusions: Optimal reliability requires the use of a single rater and the averaging of at least 3 images per visit. Under these conditions, primary measures reach acceptable levels of reliability, which is more difficult to achieve for most derived measures. The arm-lifting and leg-lifting tasks showed similar reliability; thus, the arm-lifting task is recommended to allow comparisons with previous studies. The use of a transducer position template is not recommended.
Copyright © 2013 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.