Objective: To describe the process of developing interrater reliability (IRR) for the Four Habits Coding Scheme (4HCS) when applied to heterogeneous material as part of a randomized controlled trial.
Methods: Videotapes of 497 hospital encounters involving 71 doctors from most clinical specialties were collected. Four experienced psychology students were trained as raters. We calculated Pearson's r and the intraclass correlation coefficient (ICC) on the total score across consecutive samples of 20 videos and, in the initial coding phase, Pearson's r across items within single videos.
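To illustrate the total-score statistics named in the Methods, the following is a minimal sketch rather than the authors' analysis code. It assumes a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)); the abstract does not state which ICC form was used. The function names and the example scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson's r between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_pearson(scores):
    """Pearson's r for every pair of raters.

    scores: one row per video, one total score per rater in each row.
    Returns a dict keyed by (rater_index_a, rater_index_b).
    """
    k = len(scores[0])
    columns = [[row[j] for row in scores] for j in range(k)]
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(range(k), 2)
    }


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    Assumption: this ICC form is chosen for illustration only.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: videos (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical total scores: 5 videos (rows) rated by 4 raters (columns).
    example = [
        [62, 60, 65, 61],
        [80, 78, 83, 79],
        [45, 50, 47, 44],
        [71, 69, 70, 74],
        [55, 58, 53, 57],
    ]
    print(pairwise_pearson(example))
    print(icc_2_1(example))
```

Applied to each consecutive sample of 20 videos, both statistics could then be compared against a chosen threshold such as 0.70 for every rater pair.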
Results: After 18 h of training and one rating session, Pearson's r and the ICC for the total score exceeded 0.70 for all pairs of raters. Across items within single videos, Pearson's r was never below 0.60 after the first 50 videos. At the item and habit levels, Pearson's r remained unsatisfactory for some rater pairs, mostly due to low variance on some items.
Conclusion: For evaluating the effect of communication skills training via a total score, IRR was satisfactory for the 4HCS as applied to heterogeneous material. However, good reliability at the item level was difficult to achieve.
Practice implications: The 4HCS may be used as an outcome measure for clinical communication skills in randomized controlled trials.
Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.