Background: Although the UK and Australian national breast screening programs have regarded the PERFORMS and BREAST test-set strategies as possible methods of estimating readers' clinical efficacy, the relationship between test-set performance and real-life performance has never been satisfactorily understood.
Methods: Forty-one radiologists from BreastScreen New South Wales participated in this study. Each reader interpreted a BREAST test-set which comprised sixty de-identified mammographic examinations sourced from the BreastScreen Digital Imaging Library. Spearman's rank correlation coefficient was used to compare the sensitivity measured from the BREAST test-set with screen readers' clinical audit data.
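As an illustration of the comparison described in the Methods, the following minimal sketch computes Spearman's rank correlation between per-reader test-set sensitivity and a clinical detection rate per 10 000 reads. It is not the authors' analysis code: the function names (`average_ranks`, `spearman_rho`, `rate_per_10000`) are hypothetical, the five readers and their values are synthetic, and the significance test (p-value) is omitted.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def rate_per_10000(cancers_detected, reads):
    """Detection rate expressed per 10 000 screening reads."""
    return 10_000 * cancers_detected / reads


# Synthetic, illustrative values only (NOT data from the study).
test_set_sensitivity = [0.70, 0.85, 0.60, 0.90, 0.75]
cancers = [30, 45, 20, 40, 38]
reads = [10_000, 10_000, 10_000, 10_000, 10_000]

detection_rate = [rate_per_10000(c, n) for c, n in zip(cancers, reads)]
rho = spearman_rho(test_set_sensitivity, detection_rate)
print(round(rho, 3))
```

Because the coefficient depends only on ranks, it measures whether readers who score higher on the test-set also tend to have higher clinical detection rates, without assuming a linear relationship between the two measures.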
Results: Statistically significant, moderate positive correlations were found between test-set sensitivity and each of the following metrics: rate of invasive cancer per 10 000 reads (r=0.495; p < 0.01); rate of small invasive cancer per 10 000 reads (r=0.546; p < 0.001); detection rate of all invasive cancers and DCIS per 10 000 reads (r=0.444; p < 0.01).
Conclusion: Comparison between sensitivity measured from the BREAST test-set and real-life detection rates demonstrated statistically significant, moderate positive correlations, indicating that such test-set strategies can reflect readers' clinical performance and can be used as a quality assurance tool. The strength of correlation demonstrated in this study was higher than that previously reported by others.