Variability and accuracy in mammographic interpretation using the American College of Radiology Breast Imaging Reporting and Data System

K Kerlikowske; D Grady; J Barclay; S D Frankel; S H Ominsky; E A Sickles; V Ernster

doi:10.1093/jnci/90.23.1801

J Natl Cancer Inst. 1998 Dec 2;90(23):1801-9. doi: 10.1093/jnci/90.23.1801.

Authors

K Kerlikowske¹, D Grady, J Barclay, S D Frankel, S H Ominsky, E A Sickles, V Ernster

Affiliation

¹ Department of Epidemiology and Biostatistics, Department of Veterans Affairs, University of California, San Francisco 94121, USA. kerliko@itsa.ucsf.edu

PMID: 9839520
DOI: 10.1093/jnci/90.23.1801

Abstract

Background: Several studies, which were limited by their small sample size and selection of difficult cases for review, have reported substantial variability among radiologists in interpretation of mammographic examinations. We have determined, in the largest study to date, intraobserver and interobserver agreement in interpreting screening mammography and accuracy of mammography by use of the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS).

Methods: The mammographic examinations were randomly selected on the basis of original mammographic interpretation and cancer outcome from 71,713 screening examinations performed by the Mobile Mammography Screening Program of the University of California, San Francisco, during the period from April 1985 through February 1995. The final sample included 786 abnormal examinations with no cancer detected, 267 abnormal examinations with cancer detected, and 1563 normal examinations. Films were read separately by two radiologists according to BI-RADS. Cancer status was determined by contacting women's physicians and by linkage to the regional Surveillance, Epidemiology, and End Results Program.

Results: There was moderate agreement between radiologists in reporting the presence of a finding when cancer was present (kappa = 0.54) and substantial agreement when cancer was not present (kappa = 0.62). Agreement was moderate in assigning one of the five assessment categories but was statistically significantly lower when cancer was present relative to when cancer was not present (kappa = 0.46 versus 0.56; two-sided P = .02). Agreement for reporting the presence of a finding and mammographic assessment was two-fold more likely for examinations with less dense breasts. Agreement was higher on repeat readings by the same radiologists than between radiologists. The sensitivity of mammography was lower with BI-RADS than with the original system for mammographic interpretation, but the positive predictive value of mammography was higher.
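The kappa values above measure agreement between two readers after correcting for the agreement expected by chance. As a minimal sketch, assuming an unweighted Cohen's kappa for two readers and the commonly cited Landis and Koch descriptive bands (the abstract does not state which kappa variant or scale the authors used), the calculation might look like the following. The function names and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers rating the same examinations."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of examinations")
    n = len(reader_a)
    # Observed proportion of examinations on which the readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's marginal category frequencies.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both readers used a single identical category; kappa is undefined, report 1.0.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Descriptive band for a kappa value (assumed Landis and Koch convention)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings: 1 = finding reported, 0 = no finding reported.
example_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_b = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
example_kappa = cohens_kappa(example_a, example_b)
print(round(example_kappa, 2), landis_koch_label(example_kappa))
```

Under these assumed bands, the reported values of 0.54 and 0.62 fall in the "moderate" and "substantial" ranges, which matches the wording of the results.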

Conclusion: Considerable variability in interpretation of mammographic examinations exists; this variability and the accuracy of mammography are neither improved nor diminished with use of BI-RADS.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Breast Neoplasms / diagnostic imaging*
Breast Neoplasms / pathology
Diagnosis, Differential
Female
Humans
Mammography*
Mass Screening
Observer Variation*
Radiology
Societies, Medical
United States
