Six radiologists used continuous scales to rate 529 chest-film cases for the likelihood of five types of abnormality (interstitial disease, nodule, pneumothorax, alveolar infiltrate, and rib fracture) in each of six replicated readings, yielding 36 separate ratings of each case for each abnormality. Separate analyses of all cases, and of the subsets of difficult or subtle cases for each abnormality, estimated the relative gains in accuracy (linear-scaled area under the ROC curve) obtained by averaging the case ratings across (a) six independent replications by each reader (25% gain), (b) six different readers within each replication (34% gain), or (c) all 36 readings (48% gain). Although accuracy differed among both readers and abnormalities, ROC curves for the median ratings showed similar relative gains in accuracy, somewhat greater than those predicted from the measured rating correlations. A variance-components model for the observer's latent decision variable could predict these gains from the measured correlations among single ratings of the cases. Depending on whether the model's estimates were based on the realized accuracy gains or on the rating correlations, about 48% or 39% of each reader's total decision variance (the summed variance for positive and negative cases) was random within-reader error uncorrelated between replications, another 10% or 14% came from each reader's idiosyncratic responses to individual cases, and about 43% or 47% was systematic variation that all readers found in the sampled cases.
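
The variance-components account can be illustrated with a small simulation (a minimal sketch, not the paper's analysis; the Gaussian latent decision variable, the unit separation between positive and negative cases, and the specific variance fractions, taken loosely from the correlation-based estimates above, are assumptions). Each rating is modeled as a case effect shared by all readers, plus a reader-by-case interaction, plus replication-level noise; averaging across replications, readers, or both shrinks the uncorrelated components and raises the Mann-Whitney area under the empirical ROC curve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative variance fractions (assumed, loosely matching the
# correlation-based estimates quoted in the abstract):
VAR_SYSTEMATIC   = 0.47   # case variation seen by all readers
VAR_IDIOSYNCRATIC = 0.14  # reader-specific responses to individual cases
VAR_RANDOM        = 0.39  # within-reader error, uncorrelated across replications

N_CASES, N_READERS, N_REPS = 529, 6, 6
SIGNAL_SHIFT = 1.0        # assumed latent separation of positive from negative cases

def auc(scores, labels):
    """Mann-Whitney estimate of the area under the empirical ROC curve."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    return (pos[:, None] > neg[None, :]).mean() + 0.5 * (pos[:, None] == neg[None, :]).mean()

labels = rng.integers(0, 2, N_CASES)  # 1 = abnormal, 0 = normal

# Latent decision variable: shared case effect + reader-by-case effect + replication noise.
case_fx   = SIGNAL_SHIFT * labels + rng.normal(0, np.sqrt(VAR_SYSTEMATIC), N_CASES)
reader_fx = rng.normal(0, np.sqrt(VAR_IDIOSYNCRATIC), (N_READERS, N_CASES))
noise     = rng.normal(0, np.sqrt(VAR_RANDOM), (N_READERS, N_REPS, N_CASES))
ratings   = case_fx + reader_fx[:, None, :] + noise      # shape: (readers, reps, cases)

single = auc(ratings[0, 0], labels)                 # one reader, one reading
within = auc(ratings[0].mean(axis=0), labels)       # one reader's six replications averaged
across = auc(ratings[:, 0].mean(axis=0), labels)    # six readers, one replication each
all_36 = auc(ratings.mean(axis=(0, 1)), labels)     # all 36 readings averaged
print(f"single {single:.3f}  within-reader {within:.3f}  "
      f"across-reader {across:.3f}  all 36 {all_36:.3f}")
```

Averaging n independent replications divides the within-reader error variance by n but leaves the reader-by-case component intact, whereas averaging across readers attenuates both uncorrelated components, which is why pooling all 36 readings yields the largest gain in this sketch, qualitatively mirroring the 25%, 34%, and 48% ordering reported above.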