Purpose: Population-based screening programs for the early detection of breast cancer have significantly reduced mortality in women, but they are resource intensive in terms of time, cost, and workload. They also have limitations, mainly due to the use of 2D imaging techniques, which may cause tissue overlap, and to interobserver variability. Artificial intelligence (AI) systems may be a valuable tool to assist radiologists when reading and classifying mammograms based on the malignancy of the detected lesions. However, several factors can influence the outcome of a mammogram and thus also the detection capability of an AI system. The aim of our work is to analyze the robustness of the diagnostic ability of an AI system designed for breast cancer detection. Approach: Mammograms from a population-based screening program were scored with the AI system. Sensitivity and specificity, summarized by the area under the receiver operating characteristic (ROC) curve, were obtained as a function of the mammography unit manufacturer, demographic characteristics, and several factors that may affect image quality (age, breast thickness and density, compression applied, beam quality, and delivered dose). Results: The area under the curve (AUC) from the scoring ROC curve was 0.92 (95% confidence interval = 0.89 to 0.95). It showed no dependence on any of the parameters considered, as the differences in the AUC for different interval values were not statistically significant. Conclusion: The results suggest that the AI system analyzed in our work has a robust diagnostic capability, and that its accuracy is independent of the studied parameters.
Screening programs for the early detection of breast cancer have significantly reduced mortality in women. The limitations of these programmes are primarily due to the use of 2D techniques and the high number of mammograms to be read by radiologists. Artificial Intelligence (AI) systems may lead to new tools to help radiologists read mammograms and classify the examination based on the malignancy of the detected lesions. Several factors related to breast characteristics (thickness and density), technical factors of image acquisition, X-ray system performance and image processing algorithms can influence the outcome of a mammogram and thus also the detection capability of an AI system. The aim of this work is to analyze the robustness of an AI system for breast cancer detection and its dependence on breast characteristics and technical factors. For this purpose, mammograms from a population-based screening program were scored with the AI system. The AUC (area under the ROC curve) index generated from the scoring ROC curve was 0.92 (CI(95%) = 0.89 - 0.95), demonstrating the robust performance of the AI system. Moreover, the statistical analysis performed showed that the AUC index was independent of breast characteristics, the type of mammographic system and most of the technical parameters considered, demonstrating the effectiveness of the AI system.
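As an illustration of the AUC figure and its confidence interval quoted above, the following is a minimal sketch, not the authors' analysis code. It assumes the AUC is computed in its rank (Mann-Whitney) form and that the interval is a percentile bootstrap; the function names `auc` and `bootstrap_ci` and all scores are hypothetical.

```python
import random

def auc(pos_scores, neg_scores):
    """Probability that a random cancer case scores above a random
    normal case (ties count one half): the rank form of the ROC AUC."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def bootstrap_ci(pos_scores, neg_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling cases with replacement
    within each class (an assumption; the source does not state its method)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(pos_scores, k=len(pos_scores))
        neg = rng.choices(neg_scores, k=len(neg_scores))
        stats.append(auc(pos, neg))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical AI scores, not data from the study.
cancer_scores = [0.9, 0.8, 0.4]
normal_scores = [0.7, 0.3, 0.2, 0.1]
point = auc(cancer_scores, normal_scores)
low, high = bootstrap_ci(cancer_scores, normal_scores)
print(f"AUC = {point:.3f}, 95% CI = {low:.3f} to {high:.3f}")
```

Under the same assumptions, robustness with respect to a factor such as manufacturer or breast thickness could be explored by computing the AUC and its interval separately for each subgroup and comparing them.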
KEYWORDS: Digital breast tomosynthesis, Data modeling, Mammography, Breast, Computer aided diagnosis and therapy, Detection and tracking algorithms, Radiology, Image processing, Data conversion, Machine learning
The paper presents a framework for the detection of mass-like lesions in 3D digital breast tomosynthesis (DBT). It consists of several steps, including pre- and post-processing, and a main detection block based on a Faster R-CNN deep learning network. In addition to the framework, the paper describes different training steps to achieve better performance, including transfer learning using both mammographic and DBT data. The presented approach obtained third place in the recent DBT Lesion Detection Challenge, DBTex, and was the top-ranked approach that did not use an ensemble-based method.
In this study we used a large previously built database of 2,892 mammograms and 31,650 single mammogram radiologists’ assessments to simulate the impact of replacing one radiologist by an AI system in a double reading setting. The double human reading scenario and the double hybrid reading scenario (second reader replaced by an AI system) were simulated via bootstrapping using different combinations of mammograms and radiologists from the database. The main outcomes of each scenario were sensitivity, specificity and workload (number of necessary readings). The results showed that when using AI as a second reader, workload can be reduced by 44%, sensitivity remains similar (difference -0.1%; 95% CI = - 4.1%, 3.9%), and specificity increases by 5.3% (P<0.001). Our results suggest that using AI as a second reader in a double reading setting as in screening programs could be a strategy to reduce workload and false positive recalls without affecting sensitivity.
The physical performance metrics of computed radiography (CR) systems used in screening mammography are lower than those of digital radiography (DR) systems. Also, the lack of quality assurance procedures in some countries might have a technology-dependent impact on image quality and dose. The Mexican Secretary of Health owns over 300 mammography units for breast cancer screening, about half of them of CR technology. We investigated the performance of 20 CR and 4 DR units in 13 Mexican states, applying over 30 quality-control tests associated with general equipment performance, X-ray source, automatic exposure control, mean glandular dose (MGD), image receptor and image quality, and display conditions. Tests were applied following international protocols and their compliance criteria. None of the systems passed all the significant tests. For CRs, the worst performance was observed in compensation for breast thickness, signal-to-noise ratio (SNR) homogeneity, CDMAM threshold thickness, sensitivity matching of CR plates, and the presence of artefacts. For DRs, the worst performance was in compression force, SNR homogeneity, and artefacts. MGD values agreed with recommendations for 2-7 cm PMMA thickness in 50% of CRs and 75% of DRs. The dominance of quantum noise over other components was evaluated by 4 criteria endorsed by different organizations, and results depended on the criterion applied. Analysis of the maintenance procedures suggested that one explanation for these poor results might be the complex CR technology, where the X-ray generation is controlled by a unit fabricated by one manufacturer and the image generation occurs in a non-integrated unit from a different manufacturer.
In this work, images from exams performed on digital breast tomosynthesis (DBT) systems were collected retrospectively from 660 Brazilian women who underwent screening mammography in clinics located in three Brazilian geographic regions. The raw images were processed using the Volpara software, which was used to determine the volume and volumetric density of the breast, the contact area between the breast and the tray, the compression force and pressure, and the thickness of the compressed breast. The mean breast volume determined by Volpara was 737 cm³. The median compression force was 79.5 N (range 20 to 160 N) and the median compression pressure was 9.94 kPa (range 2.6 to 29.5 kPa). Correlation analysis showed that percentage dense volume and breast volume correlated with compression force with r = -0.259 and r = 0.313, respectively (p < 0.01), and with compression pressure with r = 0.327 and r = -0.478, respectively (p < 0.01). This shows that, when an intrinsic characteristic of the breast is taken into account, there is a greater possibility of standardizing compression through compression pressure than through compression force.
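The relationship between compression force, contact area and compression pressure, and the correlation coefficients reported above, can be illustrated with a minimal sketch. It assumes pressure is simply force divided by the breast-tray contact area and that r is the Pearson coefficient (the abstract does not name the coefficient); the function names and all numbers are hypothetical and are not taken from the study or from Volpara.

```python
import math

def compression_pressure_kpa(force_n, contact_area_cm2):
    """Pressure in kPa from force in N and contact area in cm^2."""
    area_m2 = contact_area_cm2 * 1e-4   # 1 cm^2 = 1e-4 m^2
    return force_n / area_m2 / 1000.0   # Pa -> kPa

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical exams: larger breasts get more force but spread it over more area.
forces_n = [60.0, 80.0, 100.0, 120.0]
areas_cm2 = [50.0, 80.0, 110.0, 160.0]
volumes_cm3 = [300.0, 600.0, 900.0, 1400.0]

pressures_kpa = [compression_pressure_kpa(f, a) for f, a in zip(forces_n, areas_cm2)]
print("r(volume, force)    =", round(pearson_r(volumes_cm3, forces_n), 3))
print("r(volume, pressure) =", round(pearson_r(volumes_cm3, pressures_kpa), 3))
```

In this toy data set the force rises with breast volume while the pressure falls, which mirrors the opposite signs of the two volume correlations in the abstract.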
KEYWORDS: Breast, Digital breast tomosynthesis, Polymethylmethacrylate, Mammography, X-rays, Digital mammography, Breast cancer, Sensors, Imaging systems, Tissues
The objective of this work is to present the results of quality control tests applied to projection image acquisition in digital mammography and digital breast tomosynthesis (DBT). Mean glandular doses (MGD) were measured for the examinations of a series of women and for breast-simulating polymethyl methacrylate phantoms, thus assessing whether the phantoms used for dosimetry in 2D mammography are suitable for DBT dosimetry. Moreover, X-ray tube output and half value layer measurements for MGD estimation using phantoms are also presented. Three different mammography/DBT systems were considered in this work: Hologic Selenia Dimensions, General Electric Senoclaire and Pristina, and Siemens Inspiration. The results obtained for the different projections were compared with the 2D acquisitions, and the differences between the two imaging modalities were analyzed.
KEYWORDS: Breast, Mammography, Digital breast tomosynthesis, Image compression, Breast cancer, Tissues, Image analysis, X-rays, Statistical analysis, Digital mammography
This study aims to compare the MGD across mammography systems from four different manufacturers and models, and to examine the relationship between patient characteristics and MGD. A total of 7,000 3D and 2D images were analyzed using the Volpara software, from which the breast volume density (DVB) and the MGD were obtained. The patient's age and compressed breast thickness (CBT) were collected from the DICOM header of the image. For the Hologic equipment, the sample of patients presented a mean CBT of 57 (±15) mm (range 19.82 to 100.75 mm), and the medians for the other variables were 51 years of age (range 25 to 87 years), 1.75 mGy MGD (range 0.43 to 4.68 mGy), and 7.61% DVB (range 2.16% to 36.89%). The MGD was higher for the GE Senoclaire and Hologic systems than for the other tomosynthesis systems evaluated, and higher for the MLO projection than for the CC projection. The Siemens system delivered the lowest dose at all breast thicknesses evaluated.
KEYWORDS: Breast, Biopsy, Digital breast tomosynthesis, Monte Carlo methods, Biological research, X-rays, Sensors, Target acquisition, X-ray detectors, Dielectrophoresis
Stereotactic breast biopsy (SBB) is a common clinical procedure for suspicious breast lesion analysis. With the arrival of DBT-guided biopsy systems, the clinical performance of such procedures has improved enormously since breast lesions are better detected. However, little information is found in the literature regarding the patient's radiation dose during these clinical procedures. This work presents, for the first time, an approach to estimate the mean glandular dose (MGD) within the biopsy window for 101 patients who underwent breast biopsy on a commercially available DBT-guided prone table. This study is supported by the calculation of normalised glandular dose (DgN) coefficients from Monte Carlo simulations. Preliminary results show that the total MGD of the biopsy procedure varies between 10.2 mGy and 19.2 mGy for patients with breast thickness between 2 cm and 8 cm. Furthermore, a great variability in the number of acquisitions (tomo scan or stereo projections) per biopsy procedure was observed. For the investigated system, the MGD for DBT-guided breast biopsies is, for 5-6 cm thick breasts, around 23% lower than the MGD observed in stereo biopsy procedures. The proposed method represents a first approach towards a full dose estimation of DBT-guided breast biopsy procedures.
Geometric distortion is the inaccurate representation of the size or shape of a structure in the radiographic image. Exaggerated distortion makes radiography unacceptable for diagnosis. We developed a new algorithm that provides data on geometric distortion (GD) and ghost artifact-distortion (GAD) of digital breast tomosynthesis (DBT) images. This algorithm is similar to the one developed by the National Coordinating Centre for the Physics of Mammography (NCCPM), with the advantage of allowing the user to select the best-fit region of interest (ROI). The selection ensures that no information about the artifact dispersion contained in an ROI is lost. The aim of this study was to evaluate the dependence of the GD and GAD evaluation on ROI dimension (width and height) in digital breast tomosynthesis images using the new algorithm, and to compare the results obtained with the reference limit values, based on routine quality control tests for breast tomosynthesis. For the analyses, the images were initially acquired with a 5 mm thick rectangular phantom composed of polymethyl methacrylate (PMMA) containing 1 mm diameter aluminum spheres. The phantom was inserted in a 60 mm thick PMMA phantom, positioned 25 mm away from the compression tray. The height of the in-focus plane, the accuracy of positioning in the focus plane, and the appearance of the aluminum spheres in the adjacent in-focus planes were analyzed for different ROI dimensions.
In this study we analyze the impact of new x-ray beam spectra on the mean glandular doses (MGD) delivered by a digital breast tomosynthesis system. The new polyenergetic spectra are generated with a rhodium (Rh) target and a 30 μm silver (Ag) filter. To evaluate the influence of the new spectra on patient doses, we compare the MGD values with those delivered with a regular Rh/Rh target/filter combination. Individual glandularity (%) of the patients in the study was estimated using the commercial software Volpara. Median MGD values for CC and MLO views are around 38% and 46% lower, respectively, with the Rh/Ag combination than with the Rh/Rh combination. Results suggest that the new spectra, with their dose-reducing properties, could be very useful in breast cancer screening programs.
The 2D synthetic image (SM) generated from digital breast tomosynthesis (DBT) has the potential to replace conventional digital mammography (DM), thereby reducing patient dose without affecting cancer detection performance. In this work, we analysed the image quality of SMs from three different manufacturers for the specific task of detecting microcalcifications (MC), in comparison to DM. A phantom with MC clusters on a uniform background was employed, which also allowed us to explore its suitability for quality control (QC). A 4-Alternative Forced Choice (4AFC) experiment was performed by four human observers, for detection of MC clusters on a region-of-interest level. We also explored the possibility of replacing human observers with a virtual observer. For this, we developed a deep learning convolutional neural network (CNN) for the task of classifying the same images from the 4AFC study, and then compared the results to the human-based study. The results showed that for the four readers and all the systems, the percentage of correct answers (PC) was 100% and the visibility was 3 for the largest MC clusters. However, SM yielded worse detectability than DM for MC with sizes between 180 and 100 μm (PC was around 18% lower on average). The CNN yielded the same relative results across modalities and systems as the 4AFC study, expressed in terms of the area under the receiver operating characteristic curve. This might encourage the development of QC procedures based on artificial intelligence image reading, improving reproducibility and reducing costs.
KEYWORDS: Image segmentation, Digital breast tomosynthesis, Breast, Mammography, Systems modeling, Detection and tracking algorithms, Tissues, Convolutional neural networks, 3D modeling, Image processing algorithms and systems
Digital breast tomosynthesis (DBT) has superior detection performance to digital mammography (DM) for population-based breast cancer screening, but the higher number of images that must be reviewed poses a challenge for its implementation. This may be ameliorated by creating a two-dimensional synthetic mammographic image (SM) from the DBT volume, containing the most relevant information. When creating an SM, it is of utmost importance to have an accurate lesion localization detection algorithm, while segmenting fibroglandular tissue could also be beneficial. These tasks encounter an extra challenge when working with images in the medio-lateral oblique view, due to the presence of the pectoral muscle, which has similar radiographic density. In this work, we present an automatic pectoral muscle segmentation model based on a U-Net deep learning architecture, trained with 136 DBT images acquired with a single system (different BI-RADS® densities and pathological findings). The model was tested on 36 DBT images from that same system, resulting in a Dice similarity coefficient (DSC) of 0.977 (0.967-0.984). In addition, the model was tested on 125 images from two different systems and three different modalities (DBT, SM, DM), obtaining DSCs between 0.947 and 0.970, a range determined visually to provide adequate segmentations. For reference, a resident radiologist independently annotated a mix of 25 cases, obtaining a DSC of 0.971. The results suggest the possibility of using this model for inter-manufacturer DBT, DM and SM tasks that benefit from the segmentation of the pectoral muscle, such as SM generation, computer-aided detection systems, or patient dosimetry algorithms.
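The Dice similarity coefficient used above to score the segmentations is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The following is a minimal sketch of that metric on small binary masks; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks given as
    equally sized lists of rows (truthy = pixel belongs to the muscle)."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total

# Toy masks, not data from the study.
predicted = [[1, 1, 0],
             [1, 0, 0]]
reference = [[1, 1, 0],
             [0, 0, 0]]
print(dice_coefficient(predicted, reference))
```

A DSC of 1 means the two masks coincide pixel for pixel, and 0 means they do not overlap at all.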
KEYWORDS: Performance modeling, Mammography, Visual system, Visual process modeling, Algorithm development, Visualization, Data acquisition, Image analysis, Medical imaging, Signal attenuation
A software tool is presented to merge CDMAM phantom images with real mammographic backgrounds. It allows signal-known-exactly (SKE) tasks in uniform and in real backgrounds. Tasks of this kind can be used to compare human, human visual metric or model observer performance in detail detection using uniform or mammographic backgrounds.
It is well known that the local characteristics of the structures in real mammographic backgrounds reduce human performance in contrast-detail detection tasks. Consequently, that performance cannot be inferred from data acquired in white noise (flat) backgrounds such as those a CDMAM phantom produces.
It is of interest to compare the response of a mammography system to the same set of signals, embedded either in flat or in real backgrounds. This comparison achieves two goals. The first is to analyze the variation of the recognition threshold of the system for both backgrounds. The second is to analyze the performance of a human observer or a model observer over the same set of signals, varying the nature of the backgrounds.
The software tool presented here merges CDMAM images with a region of interest selected from a real mammogram. This region, as well as the image mixing method (basically adding or multiplying pixels), can be freely selected by the user. In this work a set of measurements of 8 images has been analyzed. The results give a preview of the variation of the contrast-detail detection for a human observer and a human visual system metric (R*).
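The two mixing methods mentioned above (adding or multiplying pixels) can be sketched as follows. This is only an illustration under stated assumptions, not the tool's implementation: it assumes the phantom image is normalized by its own mean so that a uniform phantom leaves the background unchanged, and that pixel values are positive. The function name `merge_images` and the toy arrays are hypothetical.

```python
def merge_images(background, phantom, mode="add"):
    """Merge a phantom image into a background region of the same size.

    mode="add":      background + (phantom - mean(phantom))
    mode="multiply": background * phantom / mean(phantom)
    Both normalizations are assumptions made for this sketch.
    """
    if len(background) != len(phantom) or any(
        len(b_row) != len(p_row) for b_row, p_row in zip(background, phantom)
    ):
        raise ValueError("background and phantom must have the same shape")
    pixels = [v for row in phantom for v in row]
    mean_phantom = sum(pixels) / len(pixels)
    if mode == "add":
        return [[b + (p - mean_phantom) for b, p in zip(b_row, p_row)]
                for b_row, p_row in zip(background, phantom)]
    if mode == "multiply":
        return [[b * p / mean_phantom for b, p in zip(b_row, p_row)]
                for b_row, p_row in zip(background, phantom)]
    raise ValueError("mode must be 'add' or 'multiply'")

# Toy 2x2 example: a structured background and a phantom with one brighter detail.
roi = [[100.0, 120.0],
       [110.0, 130.0]]
cdmam = [[50.0, 50.0],
         [50.0, 60.0]]
print(merge_images(roi, cdmam, mode="add"))
print(merge_images(roi, cdmam, mode="multiply"))
```

With additive mixing the detail keeps the same absolute contrast wherever it lands, whereas multiplicative mixing scales its contrast with the local background level, which is one reason a user might want to choose between the two.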
A software tool is presented to measure the geometric distortion in images obtained with X-ray systems. It provides a more objective method than the usual ruler-based measurements on the image of a phantom. As a first step, this software has been applied to mammography images and makes use of the grid included in the CDMAM phantom (University Hospital Nijmegen).
For digital images, this software tool automatically locates the grid crossing points and obtains a set of corners (up to 237) that are used by the program to determine 6 different squares, at top, bottom, left, right and central positions. The sixth square is the largest that can be fitted in the grid (widest possible square). The distortion is calculated as ((length of left diagonal - length of right diagonal) / length of left diagonal) (%) for the six positions. The algorithm error is of the order of 0.3%. The method might be applied to other radiological systems without any major changes, by adjusting the program code to other phantoms.
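The diagonal-based distortion measure defined above can be sketched in a few lines. This is an illustration only: it assumes the "left" diagonal runs from the top-left to the bottom-right corner and the "right" diagonal from the top-right to the bottom-left corner, which the abstract does not specify, and the corner coordinates are hypothetical.

```python
import math

def diagonal_distortion_percent(top_left, top_right, bottom_right, bottom_left):
    """Distortion (%) = (left diagonal - right diagonal) / left diagonal * 100.

    Corners are (x, y) tuples; the diagonal naming is an assumption.
    """
    left_diagonal = math.dist(top_left, bottom_right)
    right_diagonal = math.dist(top_right, bottom_left)
    return 100.0 * (left_diagonal - right_diagonal) / left_diagonal

# An undistorted square has equal diagonals, so the distortion is zero.
print(diagonal_distortion_percent((0, 0), (10, 0), (10, 10), (0, 10)))

# A slightly sheared square (hypothetical corner positions).
print(diagonal_distortion_percent((0, 0), (10, 0), (10.1, 10), (0.1, 10)))
```

In the tool described above, the same calculation would be repeated for each of the six squares built from the detected grid corners.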
In this work, a set of measurements for 54 CDMAM images, acquired on 11 different mammography systems from 6 manufacturers, is presented. We conclude that the distortion of all the equipment is smaller than the recommended maximum distortion for primary displays (2%).
The performance of 37 primary class liquid crystal display (LCD) devices (2, 3 and 5 Mpixel matrix size) used in 9 different diagnostic services in Spain has been determined in terms of 13 quantitative and visual evaluations. The equipment had never been subjected to calibration or to QC tests since commissioning by vendors, between 2 and 18 months before the measurements. Tests, using calibrated luminance meters and TG18 patterns, evaluated ambient light conditions and other basic performance indicators, namely display geometric distortion, artefacts, resolution and low-contrast visibility, compliance of the contrast luminance response with the DICOM standard, luminance extreme values, and uniformity between pairs of monitors associated with the same workstation. The principal sources of non-compliance are failures to visualize low-contrast test objects (73% of displays), excessive differences with the DICOM contrast response standard (57%), and non-uniform response of monitor pairs (54%). Also, 43% of the LCDs were located in places with excessive illumination and presented specular reflections from faceplates. The analysis of ten 5 Mpixel displays, of possible use in mammography services, indicates performance similar to that of the rest of the monitors, except for the ambient luminance (100% complying with recommendations) and larger non-compliance with the DICOM response standard (80%). No correlation between image quality indicators and monitor hours of operation was found.
As part of an EC funded project, the design for a new phantom has been proposed that consists of a smaller contrast-detail part than the CDMAM phantom and that contains items for other parts of an acceptance protocol for digital mammography. A first prototype of the "DIGIMAM" has been produced. Both the CDMAM phantom and the DIGIMAM phantom were then used on a series of systems and read out as a part of a multi centre study.
The results with the new phantom were very similar to results obtained with the CDMAM phantom: readers scored differently from each other and there was an overlap in the scores for the different systems. A system with a poor score in CDMAM also had the worst score for DIGIMAM. Reading time, however, was significantly reduced. There was promising agreement between automated reading of CDMAM and the scores of the DIGIMAM phantom. In order to reduce the subjectivity of the readings, computerized reading of the DIGIMAM should be developed. In a second version of the phantom, we propose to add more disks of the same size and contrast in each square to improve the statistical power of each reading.
The presence of degraded or soft edges in a scene or digital image may cause ambiguities in depth perception. This kind of edge has been analyzed by introducing a mathematical model to digitally implement a large category of degraded edges. The model accounts for the origin of the degradation, the performance of psychophysical tests for measuring the perception of degradation, and image characteristics currently used in artificial vision, such as moments and the number of gray levels per pixel. Its application to the artificial reproduction of human visual perception is the subject of current research.
Resolution criteria for resolving two line images under coherent and incoherent illumination are presented. This technique has been used to interpret psychophysical data obtained from hyperacuity paradigms.
We define a resolution criterion for the case of two narrow line images. To do so, we consider the two lines as a superposition of the gradients of two Heaviside step functions.