Aim: To compare the performance of Australian and Singaporean breast readers interpreting a single test-set consisting of mammographic examinations collected from the Australian population.
Background: In the teleradiology era, breast readers interpret mammographic examinations from different populations. The question arises whether two groups of readers with similar training backgrounds demonstrate the same level of performance when presented with cases from a population familiar to only one of the groups.
Methods: Fifty-three Australian and 15 Singaporean breast radiologists participated in this study. All radiologists were trained in mammogram interpretation, with a median of 9 and 15 years of experience in reading mammograms, respectively. Each reader interpreted the same BREAST test-set consisting of sixty de-identified mammographic examinations arising from an Australian population. Performance parameters including JAFROC, ROC, case sensitivity and specificity were compared between Australian and Singaporean readers using a Mann-Whitney U test.
Results: A significant difference (P=0.036) was demonstrated between the JAFROC scores of the Australian and Singaporean breast radiologists. No other significant differences were observed.
Conclusion: JAFROC scores for Australian radiologists were higher than those obtained by their Singaporean counterparts. Whilst it is tempting to attribute this to reader expertise, this may be a simplistic explanation considering the very similar training and audit backgrounds of the two populations of radiologists. The influence of reading images that differ from those that radiologists normally encounter cannot be ruled out and requires further investigation, particularly in light of increasing international outsourcing of radiologic reporting.
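The group comparison described in the Methods rests on the Mann-Whitney U statistic. The following is a minimal sketch of how that statistic is computed for two independent groups of reader scores. The function name and the example scores are hypothetical and are not taken from the study, and the conversion of U to a P value (normally done with a statistics package) is not shown.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the (a, b) pairs in which the value from group_a exceeds
    the value from group_b; tied pairs contribute one half.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical per-reader scores, for illustration only.
scores_group_a = [0.80, 0.90, 0.70]
scores_group_b = [0.60, 0.50]
u_a, u_b = mann_whitney_u(scores_group_a, scores_group_b)
print(u_a, u_b)
```

Counting pairs directly is quadratic in the group sizes, which is acceptable for reader groups of the size reported here; a rank-based formulation gives the same statistic.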
This study aims to investigate the effectiveness of the single cranio-caudal (CC) mammogram in comparison with traditional two-projection mammography for breast cancer detection. Sixteen radiologists were invited to report 60 two-projection (MLO and CC) mammograms of the left and right breasts, of which 20 cases contained cancer. Participants searched for the presence of breast lesion(s) on each view and provided a confidence score. Sensitivity, lesion sensitivity and specificity were compared between the single CC projection and the two-projection approach among different groups of readers. Results showed that expert readers needed only a single CC mammogram in their reading, while non-expert readers required two-projection mammography.
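As an illustration of the case-level measures named above, the sketch below derives sensitivity and specificity from per-case confidence scores. It assumes that a case counts as recalled when the reader's score reaches a chosen threshold; the threshold, the score scale and the example data are assumptions, not details from the study. Lesion sensitivity, which also requires localization data, is not covered.

```python
def case_sensitivity_specificity(truth, scores, threshold):
    """Compute case-level sensitivity and specificity for one reader.

    truth     -- list of bool, True where the case contains cancer
    scores    -- reader confidence score for each case
    threshold -- assumed cut-off: score >= threshold means "recalled"
    """
    true_positives = sum(1 for t, s in zip(truth, scores) if t and s >= threshold)
    true_negatives = sum(1 for t, s in zip(truth, scores) if not t and s < threshold)
    n_cancer = sum(1 for t in truth if t)
    n_normal = len(truth) - n_cancer
    return true_positives / n_cancer, true_negatives / n_normal


# Hypothetical reader data, for illustration only.
truth = [True, True, False, False, False]
scores = [4, 1, 1, 3, 2]
sensitivity, specificity = case_sensitivity_specificity(truth, scores, threshold=3)
print(sensitivity, specificity)
```

Running the same function on the scores given for the CC view alone and for both views would yield the paired values that are then compared between reader groups.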
Rationale and Objectives: This study investigates the link between radiologists’ experience in reporting mammograms, their caseloads and the decision to give a classification of Royal Australian and New Zealand College of Radiologists (RANZCR) category ‘3’ (indeterminate or equivocal finding).
Methods: A test set of 60 mammograms comprising 20 abnormal and 40 normal cases was shown to 92 radiologists. Each radiologist was asked to identify and localize abnormalities and provide a RANZCR assessment category. Details were obtained from each reader regarding their experience, qualifications and breast reading activities. ‘Equivocal fractions’ were calculated by dividing the number of ‘equivocal findings’ given by each radiologist in the abnormal and normal cases by the total number of cases analyzed: 20 and 40 respectively. The ‘equivocal fractions’ for each of the groups (normal vs abnormal) were independently correlated with age, number of years since qualification as a radiologist, number of years reading mammograms, number of mammograms read per year, number of hours reading mammograms per week and number of mammograms read over lifetime (the number of years reading mammograms multiplied by the number of mammograms read per year). The non-parametric Spearman test was used.
Results: Statistically significant negative correlations were noted between ‘equivocal fractions’ and the following variables:
• For abnormal cases: hours reading mammograms per week (r= -0.38, P= 0.0001).
• For normal cases: number of mammograms read per year (r= -0.29, P= 0.006); number of mammograms read over lifetime (r= -0.21, P= 0.049); hours reading mammograms per week (r= -0.20, P= 0.05).
Conclusion: Radiologists with greater reading experience assign fewer RANZCR category 3 or equivocal classifications. The findings have implications for screening program efficacy and recall rates. This work is still in progress and further data will be presented at the conference.
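The two derived quantities defined in the Methods can be written out as a short sketch. The denominators of 20 abnormal and 40 normal cases come from the abstract; the function names and the example reader values are hypothetical.

```python
def equivocal_fractions(n_equivocal_abnormal, n_equivocal_normal,
                        n_abnormal=20, n_normal=40):
    """Return (abnormal fraction, normal fraction) of category 3 ratings.

    Each fraction is the number of equivocal ratings a reader gave in
    that group of cases divided by the number of cases in the group.
    """
    return n_equivocal_abnormal / n_abnormal, n_equivocal_normal / n_normal


def lifetime_reads(years_reading, reads_per_year):
    """Estimate lifetime mammograms read as years multiplied by reads per year."""
    return years_reading * reads_per_year


# Hypothetical reader: 5 equivocal ratings in abnormal cases, 4 in normal cases.
print(equivocal_fractions(5, 4))
print(lifetime_reads(10, 2000))
```

Each reader contributes one pair of fractions, and each fraction is then correlated separately with the experience variables.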
Rationale and Objectives: To identify parameters linked to higher levels of performance in screening mammography. In particular, we explored whether experience in reading digital cases enhances radiologists’ performance.
Methods: A total of 60 cases were presented to the readers, of which 20 contained cancers and 40 showed no abnormality. Each case comprised four images, and 129 breast readers participated in the study. Each reader was asked to identify and locate any malignancies using a 1-5 confidence scale. All images were displayed using 5MP monitors, supported by radiology workstations with full image manipulation capabilities. A jackknife free-response receiver operating characteristic (JAFROC) figure of merit (FOM) methodology was employed to assess reader performance. Details were obtained from each reader regarding their experience, qualifications and breast reading activities. Spearman and Mann-Whitney U techniques were used for statistical analysis.
Results: Higher performance was positively related to number of years professionally qualified (r= 0.18; P<0.05), number of years reading breast images (r= 0.24; P<0.01), number of mammography images read per year (r= 0.28; P<0.001) and number of hours reading mammographic images per week (r= 0.19; P<0.04). Unexpectedly, higher performance was inversely linked to previous experience with digital images (r= -0.17; P<0.05), and further analysis demonstrated that this finding was due to changes in specificity.
Conclusion: This study suggests that readers with experience in reporting digital images may exhibit a reduced ability to correctly identify normal appearances, a finding that requires further investigation. Higher performance is linked to number of cases read per year.
Aim: To examine the relationship between sensitivity measured from the BREAST test-set and clinical performance.
Background: Although the UK and Australian national breast screening programs have regarded the PERFORMS and BREAST test-set strategies as possible methods of estimating readers' clinical efficacy, the relationship between test-set and real-life performance has never been satisfactorily understood.
Methods: Forty-one radiologists from BreastScreen New South Wales participated in this study. Each reader interpreted a BREAST test-set which comprised sixty de-identified mammographic examinations sourced from the BreastScreen Digital Imaging Library. Spearman's rank correlation coefficient was used to compare the sensitivity measured from the BREAST test-set with screen readers' clinical audit data.
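For readers unfamiliar with the statistic, Spearman's rank correlation coefficient can be sketched as the Pearson correlation of the ranked values, with tied values sharing their average rank. The function names and the example numbers below are hypothetical and are not study data; the significance test is not shown.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    # Undefined (division by zero) if either input is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical test-set sensitivities and audit detection rates per reader.
test_set_sensitivity = [0.70, 0.75, 0.80, 0.85, 0.90]
audit_detection_rate = [42.0, 40.0, 55.0, 50.0, 60.0]
print(spearman_rho(test_set_sensitivity, audit_detection_rate))
```

Because only ranks enter the calculation, the coefficient is insensitive to the units in which the audit rates are expressed.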
Results: Statistically significant moderate positive correlations were found between test-set sensitivity and each of the following metrics: rate of invasive cancer per 10 000 reads (r=0.495; p < 0.01); rate of small invasive cancer per 10 000 reads (r=0.546; p < 0.001); detection rate of all invasive cancers and DCIS per 10 000 reads (r=0.444; p < 0.01).
Conclusion: Comparison between sensitivity measured from the BREAST test-set and real-life detection rates demonstrated statistically significant moderate positive correlations, supporting the view that such test-set strategies can reflect readers' clinical performance and be used as a quality assurance tool. The strength of correlation demonstrated in this study was higher than previously found by others.
High-quality breast imaging and accurate image assessment are critical to the early diagnosis, treatment and management of women with breast cancer. The Breast Screen Reader Assessment Strategy (BREAST) provides a platform, accessible by researchers and clinicians worldwide, which will contain image databases, algorithms to assess reader performance and online systems for image evaluation. The platform will contribute to the diagnostic efficacy of breast imaging in Australia and beyond on two fronts: reducing errors in mammography, and transforming our assessment of novel technologies and techniques. Mammography is the primary diagnostic tool for detecting breast cancer, with over 800,000 women X-rayed each year in Australia; however, it fails to detect 30% of breast cancers, and a number of the missed cancers are visible on the image [1-6]. BREAST will monitor these mistakes, identify reasons for mammographic errors, and facilitate innovative solutions to reduce error rates. The BREAST platform has the potential to enable expert assessment of breast imaging innovations, wherever in the world the experts or innovations are located. Currently, innovations are often assessed by limited numbers of individuals who happen to be located close to the innovation, resulting in equivocal studies with low statistical power. BREAST will transform this paradigm by enabling large numbers of experts to assess any new method or technology using our embedded evaluation methods. We are confident that this world-first system will play an important part in the future efficacy of breast imaging.