This PDF file contains the front matter associated with SPIE
Proceedings Volume 6515, including the Title Page, Copyright
information, Table of Contents, Introduction, and the
Conference Committee listing.
Binary ROC analysis has solid decision-theoretic foundations and a close relationship to linear
discriminant analysis (LDA). In particular, for the case of Gaussian equal covariance input data, the area under the ROC
curve (AUC) value has a direct relationship to the Hotelling trace. Many attempts have been made to extend binary
classification methods to multi-class. For example, Fukunaga extended binary LDA to obtain multi-class LDA, which
uses the multi-class Hotelling trace as a figure-of-merit, and we have previously developed a three-class ROC analysis
method. This work explores the relationship between conventional multi-class LDA and three-class ROC analysis. First,
we developed a linear observer, the three-class Hotelling observer (3-HO). For Gaussian equal covariance data, the 3-HO provides equivalent performance to the three-class ideal observer and, under less strict conditions, maximizes the signal-to-noise ratio for classification of all pairs of the three classes simultaneously. The 3-HO templates are not the eigenvectors obtained from multi-class LDA. Second, we show that the three-class Hotelling trace, which is the figure-of-merit in the conventional three-class extension of LDA, has significant limitations. Third, we demonstrate that, under
certain conditions, there is a linear relationship between the eigenvectors obtained from multi-class LDA and 3-HO
templates. We conclude that the 3-HO based on decision theory has advantages both in its decision theoretic background
and in the usefulness of its figure-of-merit. Additionally, there exists the possibility of interpreting the two linear features
extracted by the conventional extension of LDA from a decision theoretic point of view.
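As a concrete reference point for the binary building block underlying the 3-HO, the sketch below estimates a Hotelling template and its classification SNR from synthetic Gaussian equal-covariance data; all names and values are illustrative, and the 3-HO extends this construction to the three pairwise class comparisons.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Synthetic Gaussian equal-covariance data: two classes in 16 dimensions.
d = 16
mu0, mu1 = np.zeros(d), np.full(d, 0.3)
A = rng.standard_normal((d, d))
S = A @ A.T / d + np.eye(d)                 # shared covariance (SPD)

x0 = rng.multivariate_normal(mu0, S, size=2000)
x1 = rng.multivariate_normal(mu1, S, size=2000)

# Hotelling template, estimated from samples: w = S^{-1} (mu1 - mu0).
w = np.linalg.solve(0.5 * (np.cov(x0.T) + np.cov(x1.T)),
                    x1.mean(0) - x0.mean(0))

# SNR of the linear test statistic t = w . g; for Gaussian equal-covariance
# data the ROC area follows directly: AUC = Phi(SNR / sqrt(2)).
t0, t1 = x0 @ w, x1 @ w
snr = (t1.mean() - t0.mean()) / np.sqrt(0.5 * (t0.var() + t1.var()))
print(f"SNR = {snr:.3f}, implied AUC = {norm.cdf(snr / np.sqrt(2)):.3f}")
```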
We have shown previously that an obvious generalization of the area under an ROC curve (AUC) cannot serve
as a useful performance metric in classification tasks with more than two classes. We define a new performance
metric, grounded in the concept of expected utility familiar from ideal observer decision theory, but which
should not suffer from the issues of dimensionality and degeneracy inherent in the hypervolume under the ROC
hypersurface in tasks with more than two classes. In the present work, we compare this performance metric
with the traditional AUC metric in a variety of two-class tasks. Our numerical studies suggest that the behavior
of the proposed performance metric is consistent with that of the AUC performance metric in a wide range of
two-class classification tasks, while analytical investigation of three-class "near-guessing" observers supports our
claim that the proposed performance metric is well-defined and positive in the limit as the observer's performance
approaches that of the guessing observer.
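For context, the traditional AUC used as the baseline in these comparisons can be estimated nonparametrically as the Wilcoxon-Mann-Whitney statistic; a minimal sketch with illustrative data:

```python
import numpy as np

def empirical_auc(t_neg, t_pos):
    """Wilcoxon-Mann-Whitney estimate of the area under the ROC curve:
    the probability that a signal-present test statistic exceeds a
    signal-absent one, counting ties as one half."""
    diff = np.asarray(t_pos)[:, None] - np.asarray(t_neg)[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

rng = np.random.default_rng(1)
# Unit-variance normal scores separated by 1: expected AUC is
# Phi(1 / sqrt(2)) ~ 0.76.
print(empirical_auc(rng.normal(0, 1, 500), rng.normal(1, 1, 500)))
```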
The LROC curve may be generalized in two ways. We can replace the location of the signal with an arbitrary
set of parameters that we wish to estimate. We can also replace the binary function that determines whether
an estimate is correct by a utility function that measures the usefulness of a particular estimate given the true
parameter set. The expected utility for the true-positive detections may then be plotted versus the false-positive
fraction as the detection threshold is varied to generate an estimation ROC curve (EROC). Suppose we run a
2AFC study where the observer must decide which image has the signal and then estimate the parameter set.
Then the average value of the utility on those image pairs where the observer chooses the correct image is an
estimate of the area under the EROC curve (AEROC). The ideal LROC observer may also be generalized to the
ideal EROC observer, whose EROC curve lies above those of all other observers. When the utility function is
non-negative the ideal EROC observer shares many properties with the ideal ROC observer, which can simplify
the calculation of the ideal AEROC. When the utility function is concave the ideal EROC observer makes use
of the posterior mean estimator. Other estimators that arise as special cases include maximum a posteriori
estimators and maximum likelihood estimators. Multiple signals may be accommodated in this framework by
making the number of signals one of the parameters in the set to be estimated.
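A minimal sketch of the 2AFC estimate of the AEROC described above, under illustrative assumptions (the utility function, test statistics, and estimator below are placeholders, not those of the paper): utility is accumulated only on trials where the correct image is chosen, so a utility identically equal to one reduces the estimate to proportion correct, the usual 2AFC estimate of the AUC.

```python
import numpy as np

rng = np.random.default_rng(2)

def utility(theta_hat, theta_true, scale=0.5):
    # Illustrative non-negative utility: 1 for a perfect estimate,
    # decaying smoothly with estimation error.
    return np.exp(-((theta_hat - theta_true) / scale) ** 2)

n_pairs, d_prime = 10_000, 1.5
total = 0.0
for _ in range(n_pairs):
    theta = rng.uniform(-1.0, 1.0)             # true parameter set (scalar here)
    t_signal = d_prime + rng.standard_normal() # test statistic, signal image
    t_noise = rng.standard_normal()            # test statistic, noise image
    if t_signal > t_noise:                     # observer picks the correct image
        theta_hat = theta + 0.2 * rng.standard_normal()  # noisy estimate
        total += utility(theta_hat, theta)
    # Incorrect choices contribute zero; with utility == 1 this reduces to
    # proportion correct, the usual 2AFC estimate of the AUC.

print("AEROC estimate:", total / n_pairs)
```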
Validating the use of new imaging technologies for screening large patient populations is an important and very
challenging area of diagnostic imaging research. A particular concern in ROC studies evaluating screening technologies
is the problem of verification bias, in which an independent verification of disease status is only available for a
subpopulation of patients, typically those with positive results by a current screening standard. For example, in
screening mammography, a study might evaluate a new approach using a sample of patients who have undergone needle
biopsy following a standard mammogram and subsequent work-up. This case sampling approach provides accurate
independent verification of ground truth and increases the prevalence of disease cases. However, the selection criteria
will likely bias results of the study. In this work we present an initial exploration of an approach to correcting this bias
within the parametric framework of binormal assumptions. We posit conditionally bivariate normal distributions on the
latent decision variable for both the new methodology and the screening standard. In this case, verification bias
can be seen as the effect of missing data from an operating point in the screening standard. We examine the magnitude
of this bias in the setting of breast cancer screening with mammography, and we derive a maximum likelihood approach
to estimating bias corrected ROC curves in this model.
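For reference, the binormal model posited here yields the standard closed-form ROC curve; a small sketch with illustrative parameter values:

```python
import numpy as np
from scipy.stats import norm

# Binormal model: the latent decision variable is N(0, 1) for actually
# negative cases and N(mu, sigma^2) for positives, which gives
#   TPF = Phi(a + b * Phi^{-1}(FPF)),  a = mu / sigma,  b = 1 / sigma.
a, b = 1.5, 0.8                 # illustrative binormal parameters

fpf = np.linspace(1e-4, 1 - 1e-4, 200)
tpf = norm.cdf(a + b * norm.ppf(fpf))

# Closed-form area under the binormal ROC curve.
print("Az =", norm.cdf(a / np.sqrt(1 + b ** 2)))
```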
There are at least two sources of variance when estimating the performance of an imaging device: the doctors
(readers) and the patients (cases). These sources of variability generate variances and covariances in the observer
study data that can be addressed with multi-reader, multi-case (MRMC) variance analysis. Frequently, a fully-crossed
study design is used to collect the data; every reader reads every case. For imaging devices used during
in vivo procedures, however, a fully-crossed design is infeasible. Instead, each patient is diagnosed by only one
doctor, a doctor-patient study design. Here we investigate percent correct (PC) under this doctor-patient study
design. From a probabilistic foundation, we present the bias and variance of two statistics: pooled PC and
reader-averaged PC. We also present variance estimates of these statistics and compare them to naive estimates.
Finally, we run simulations to assess the statistics and the variance estimates. The two PC statistics have the
same means but different variances. The variances depend on how patients are distributed among the readers
and the amount of reader variability. Regarding the variance estimates, the MRMC estimates are unbiased,
whereas the naive estimates bracket the true variance and can be extremely biased.
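The two statistics can be stated compactly: pooled PC weights every case equally, while reader-averaged PC weights every reader equally. A minimal sketch under the doctor-patient design (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Doctor-patient design: each case is read by exactly one reader.
n_readers = 5
caseloads = rng.integers(20, 120, n_readers)     # unequal cases per reader
skill = rng.beta(8, 2, n_readers)                # per-reader P(correct)

scores = [rng.random(n) < p                      # 1 = correct decision
          for n, p in zip(caseloads, skill)]

# Pooled PC: every case counts equally, so high-volume readers dominate.
pooled_pc = np.concatenate(scores).mean()

# Reader-averaged PC: every reader counts equally regardless of caseload.
reader_avg_pc = np.mean([s.mean() for s in scores])

# The two statistics estimate the same mean but have different variances,
# depending on the caseload distribution and the reader variability.
print(f"pooled PC = {pooled_pc:.3f}, reader-averaged PC = {reader_avg_pc:.3f}")
```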
In this paper, we propose a methodology for evaluating whether the use of CAD is effective for any given reader or
case, first analyzing the results of readers' judgments (0 or 1) by the technique known as analysis of bias-variance
characteristics (BVC) [1, 2], and then combining this with ROC analysis to elucidate the internal structure of the ROC curve.
The mean and variance are first calculated for the situation when multiple readers examine a medical image for a single
case without CAD and with CAD, and assign the values 0 and 1 to their judgment of whether abnormal findings are
absent or present or whether the case is normal or abnormal. The mean of these values represents the degree of bias
from the true diagnosis for the particular case, and the variance represents the spread of judgments between readers.
When the relationship between the two parameters is examined for several cases with differing degrees of diagnostic
difficulty, the mean (horizontal axis) and variance (vertical axis) show a bell-shaped relation. We have named this
typical phenomenon, which arises when images are read, the bias-variance characteristic (BVC) of diagnosis. The mean of the
0 and 1 judgments of multiple readers is regarded as a measure of the confidence level determined for the particular
case. ROC curves were drawn by usual methods for diagnoses made without CAD and with CAD. From the difference
between the TPF obtained without CAD and with CAD for the same FPF on the ROC curve, we were able to quantify
the total number of readers and the total number of cases for which CAD support was beneficial.
To demonstrate its usefulness, we applied this method to data obtained in a reading experiment that aimed to evaluate
detection performance for abnormal findings and data obtained in a reading experiment that aimed to evaluate
diagnostic discrimination performance for normal and abnormal cases. We analyzed the internal structure of the ROC
curve produced when all cases were included, and showed that there is a relationship between the degree of diagnostic
difficulty of the case and the benefit of CAD support and demonstrated that there are patients and readers for whom
CAD is of benefit and those for whom it is not.
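A minimal sketch of the BVC computation described above, on randomly generated placeholder judgments; for 0/1 data the population variance equals mean × (1 − mean), which is the source of the bell shape:

```python
import numpy as np

rng = np.random.default_rng(4)

n_readers, n_cases = 20, 50
# Placeholder data: each case has a latent difficulty giving the probability
# that a reader judges it "abnormal" (1) rather than "normal" (0).
p_abnormal = rng.beta(2, 2, n_cases)
judgments = rng.random((n_readers, n_cases)) < p_abnormal   # 0/1 matrix

mean_per_case = judgments.mean(axis=0)  # confidence level / bias from truth
var_per_case = judgments.var(axis=0)    # spread of judgments across readers

# Plotted as (mean, variance) pairs, cases trace out the bell-shaped BVC:
# for 0/1 data the variance is largest near a mean of 0.5 (maximal reader
# disagreement) and vanishes at 0 and 1 (unanimous judgments).
for m, v in zip(mean_per_case[:5], var_per_case[:5]):
    print(f"mean = {m:.2f}  variance = {v:.2f}")
```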
Harold Kundel Honorary Lecture and Image Perception
Visual perception is such an intimate part of human experience that we assume that it is entirely
accurate. Yet, perception accounts for about half of the errors made by radiologists using adequate imaging
technology. The true incidence of errors that directly affect patient well-being is not known, but it is probably
at the lower end of the reported values of 3 to 25%. Errors in screening for lung and breast cancer are
somewhat better characterized than errors in routine diagnosis. About 25% of cancers actually recorded on
the images are missed and cancer is falsely reported in about 5% of normal people.
Radiologists must strive to decrease error not only because of the potential impact on patient care but
also because substantial variation among observers undermines confidence in the reliability of imaging
diagnosis. Observer variation also has a major impact on technology evaluation because the variation
between observers is frequently greater than the difference in the technologies being evaluated. This has
become particularly important in the evaluation of computer aided diagnosis (CAD).
Understanding the basic principles that govern the perception of medical images can provide a
rational basis for making recommendations for minimizing perceptual error. It is convenient to organize
thinking about perceptual error into five steps. 1) The initial acquisition of the image by the eye-brain
(contrast and detail perception). 2) The organization of the retinal image into logical components to produce
a literal perception (bottom-up, global, holistic). 3) Conversion of the literal perception into a preferred
perception by resolving ambiguities in the literal perception (top-down, simulation, synthesis). 4) Selective
visual scanning to acquire details that update the preferred perception. 5) Application of decision criteria to the
preferred perception.
The five steps are illustrated with examples from radiology with suggestions for minimizing error.
The role of perceptual learning in the development of expertise is also considered.
Mammography screening is the most widely utilized tool to screen for breast cancer.
Radiologists read a mammogram using a two-pass strategy where the first pass is guided by salient
features of the image (the so-called 'pop-out' elements), and the second pass is a systematic search.
It is assumed that most breast masses that are reported by the radiologist are in fact detected during
the first pass of this search strategy, and that the second pass is useful for the detection of
microcalcification clusters. Furthermore, experiments in other visual domains have shown that
observers are attracted faster to incongruous elements in a display than to normal (i.e., more
expected) elements. In this sense, it can be argued that benign findings constitute more expected
findings, because they encompass a large percentage of all abnormalities found on a mammogram.
In this experiment we sought to determine whether the search for malignant masses was indeed
faster than the search for benign masses. We also aimed to determine whether the observers' overall
visual search behavior was different between benign and malignant cases, not only in terms of how
long it took the observers to hit the location of the lesion, but also how long the observers took
analyzing the case, how the distribution of false-positive responses differed between the two
types of cases, etc.
The goal of this research was to evaluate two different stack mode layouts for 3D
medical images - a regular stack mode layout where just the topmost image was visible,
and a new stack mode layout, which included the images just before and after the main
image. We developed stripped-down user interfaces to test the techniques and designed a
look-alike radiology task using 3D artificial target stimuli implanted in the slices of
medical image volumes. The task required searching for targets and identifying the range
of slices containing the targets.
Eight naive students participated, using a within-subjects design. We measured the
response time and accuracy of subjects using the two layouts and tracked the eyegaze of
several subjects while they performed the task. Eyegaze data were divided into fixations
and saccades.
Subjects were 19% slower with the new stack layout than the standard stack layout,
but 5 of the 8 subjects preferred the new layout. Analysis of the eyegaze data showed that
in the new technique, the context images on both sides were fixated once the target was
found in the topmost image. We believe that the extra time was caused by the difficulty in controlling the rate of scrolling, causing overshooting. We surmise that providing some contextual detail such as adjacent slices in the new stack mode layout is helpful to reduce cognitive load for this radiology look-alike task.
Oliver Burgert, Veronika Örn, Boris M. Velichkovsky, Michael Gessat, Markus Joos, Gero Strauß M.D., Christian Tietjen, Bernhard Preim, Ilka Hertel M.D.
Proceedings Volume Medical Imaging 2007: Image Perception, Observer Performance, and Technology Assessment, 65150B (2007) https://doi.org/10.1117/12.709631
Neck dissection is a surgical intervention in which cervical lymph node metastases are removed. Accurate surgical planning is of high importance because a wrong judgment of the situation can cause severe harm to the patient. Diagnostic perception of radiological images by a surgeon is an acquired skill that can be enhanced by training and experience. To improve the accuracy with which newcomers and less experienced professionals detect pathological lymph nodes, it is essential to understand how surgical experts solve the relevant visual and recognition tasks. By using eye tracking, and especially the newly developed attention-landscape visualizations, it could be determined whether visualization options, for example 3D models instead of CT data, help to increase the accuracy and speed of neck dissection planning. Thirteen ORL surgeons with different levels of expertise participated in this study. They inspected different visualizations of 3D models and original CT datasets of patients. Among other methods, we used scanpath analysis and attention landscapes to interpret the inspection strategies. It was possible to distinguish different patterns of visual exploratory activity. The experienced surgeons exhibited a higher concentration of attention on a limited number of areas of interest and demonstrated fewer saccadic eye movements, indicating better orientation.
Eye position monitoring has been used for decades in Radiology in order to determine how
radiologists interpret medical images. Using these devices several discoveries about the
perception/decision making process have been made, such as the importance of comparisons of
perceived abnormalities with selected areas of the background, the likelihood that a true lesion will
attract visual attention early in the reading process, and the finding that most misses attract
prolonged visual dwell, often comparable to dwell in the location of reported lesions. However, eye
position tracking is a cumbersome process, which often requires the observer to wear helmet gear
that contains the eye tracker itself and a magnetic head tracker, which allows for the computation
of head position. Observers tend to complain of fatigue after wearing the gear for a prolonged time.
Recently, with the advances made to remote eye-tracking, the use of head-mounted systems seemed
destined to become a thing of the past. In this study we evaluated a remote eye tracking system, and
compared it to a head-mounted system, as radiologists read a case set of one-view mammograms on
a high-resolution display. We compared visual search parameters between the two systems, such as
time to hit the location of the lesion for the first time, amount of dwell time in the location of the
lesion, total time analyzing the image, etc. We also evaluated the observers' impressions of both
systems, and what their perceptions were of the restrictions of each system.
Contrast-detail analysis is one of the most common ways to assess the performance of an imaging system. Usually, the reading of phantoms such as CDMAM is performed by human observers. The main drawbacks of this practice are inter-observer variability and the great amount of time needed. However, software programs are available for reading CDMAM images automatically. In this paper we present a comparison of human and software reading of CDMAM images from three different FFDM clinical units. Images were acquired at different exposures under the same conditions for the three systems. Once the software has completed the reading, the results are interpreted in the same way as for the human case. CDCOM results are consistent with the human analysis if we consider figures such as COR and IQF. On the other hand, we find some discrepancies between the CD curves obtained by human observers and those estimated by the automated CDCOM analysis.
PERFORMS (Personal Performance in Mammographic Screening), a self-assessment scheme for film-readers, is
undertaken as an educational tool by mammographers reading breast-screening films in the UK. The scheme has been
running as a bi-annual exercise since its inception in 1991. In addition to completing the scheme each year the majority
of film-readers also choose to complete a questionnaire, administered as part of the scheme, indicating key aspects of
their everyday reading practice. These key aspects include volume of cases read per week, time on task reading
screening films, incidence and timing of break periods, and typical number of film-reading sessions per week.
Previous recommendations on best screening practice (notably the optimum time on task) were considered in the
light of these film-readers' self-reports on a current PERFORMS case set.
In addition we looked at performance accuracy of over 450 film-readers reading PERFORMS cases (60 difficult
mammographic cases). Performance on measures akin to True Positive (Correct Recall Percentages) and True Negative
(Correct Return to Screen Percentages) decisions was investigated. The data presented demonstrate that individual
behaviours in real-life screening, for the interpretation of mammographic cases (namely volume of cases read per
week and film-reading experience), affect film-reading sensitivity and specificity on a test set of mammograms.
The consequences for best screening practice, in real life, are considered.
The purpose of this study is to determine the relative effect of MTF, DQE, and pixel size on the shape of
microcalcifications in mammography. Two original images were obtained by a) scanning the film that accompanies an
RMI-156 phantom at a resolution of 25μm per pixel, and b) creating an image with various shapes on a computer. Simulated
images were then obtained by changing MTF, adding noise to simulate DQE effects, and changing the resolution of the
original images. These images were visually evaluated to determine the recognition of the shape. In the evaluation of
400μm microcalcifications on the RMI-156 phantom, we found that shape recognition is maintained with a pixel size of
50μm or less regardless of MTF. However, at pixel sizes larger than 50μm, recognition was insufficient even when MTF was
increased. Adding noise decreased visibility but did not affect shape recognition. The same results were obtained using
computer-created shapes. The effect of pixel size on the recognition of the shape of microcalcifications was shown to be
greater compared to MTF and DQE. It was also found that increasing MTF does not compensate for information lost
because of enlarged pixel size.
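A hedged sketch of a degradation chain of the kind described, not the authors' exact processing: MTF reduction is approximated by a Gaussian blur, DQE loss by additive noise, and enlarged pixel size by block averaging.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_acquisition(img, mtf_sigma_px, noise_sd, binning, seed=0):
    """Illustrative degradation chain for a high-resolution original
    (e.g., 25um/pixel): blur to lower the MTF, add noise to mimic reduced
    DQE, then block-average to enlarge the effective pixel size."""
    rng = np.random.default_rng(seed)
    out = gaussian_filter(img.astype(float), mtf_sigma_px)   # MTF loss
    out = out + rng.normal(0.0, noise_sd, out.shape)         # DQE loss
    h, w = (s - s % binning for s in out.shape)              # crop to multiple
    return out[:h, :w].reshape(h // binning, binning,
                               w // binning, binning).mean(axis=(1, 3))

# e.g., a 16x16-pixel (400um) object on a 25um grid, rendered at 100um pixels:
phantom = np.zeros((512, 512))
phantom[248:264, 248:264] = 1.0
degraded = simulate_acquisition(phantom, mtf_sigma_px=1.5,
                                noise_sd=0.05, binning=4)
print(degraded.shape)   # (128, 128)
```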
Purpose:
1/ To validate a method for simulating microcalcifications in mammography
2/ To evaluate the effect of anatomical background on visibility of (simulated) microcalcifications
Materials and methods:
Microcalcifications were extracted from the raw data of specimens from stereotactic vacuum needle biopsies. The sizes
of the templates varied from 200μm to 1350μm and the peak contrast from 1.3% to 24%. Experienced breast imaging
radiologists were asked to blindly evaluate images containing real and simulated lesions. Analysis was done using ROC
methodology.
The simulated lesions were used to create composite image datasets: 408 microcalcifications were
simulated into 161 ROIs of 59 digital mammograms having different anatomical backgrounds. Nine radiologists were
asked to detect and rate them under free-search conditions. A free-response receiver operating characteristic
(FROC) study was applied to find correlations between detectability and anatomical background.
Results:
1/ The calculated area under the ROC curve, Az, was 0.52 ± 0.04. Simulated microcalcifications could not be
distinguished from real ones.
2/ In the anatomical background classified as Category 1 (fatty), the detection fraction is the lowest (0.48), while for
Categories 2, 3, and 4 there is a gradual decrease (from 0.61 to 0.54) as glandularity increases. The number of false positives is
highest for background Category 1 (24%), compared to the other three types (16%). An 80% detectability is
found for microcalcifications with a diameter >400μm and a peak contrast >10%. Anatomic noise seems to limit
the detectability of large low-contrast lesions with diameters >700μm.
We report on the development of a novel software tool for the simulation of chest lesions. This software tool was developed for use in our study to determine optimal ambient lighting conditions for chest radiology; the study involved 61 consultant radiologists from the American Board of Radiology. The software has two main functions: the simulation of lesions and the retrieval of information for ROC (Receiver Operating Characteristic) and JAFROC (Jack-Knife Free Response ROC) analysis. The simulation layer operates by randomly selecting an image from a bank of reportedly normal chest x-rays. A random location is then generated for each lesion and checked against a reference lung-map. If the location is within the lung fields, as derived from the lung-map, a lesion is superimposed. Lesions are also randomly selected from a bank of manually created chest lesion images. A blending algorithm determines the best intensity levels for the lesion to sit naturally within the chest x-ray. The same software was used to run a study for all 61 radiologists. A sequence of images is displayed in random order; half of these images had simulated lesions, ranging from subtle to obvious, and half were normal. The operator then selects locations where he/she thinks lesions exist and grades each lesion accordingly. We have found that this software was very effective in this study, and we intend to use the same principles for future studies.
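A simplified sketch of the placement loop described above; the lung-map check is as stated, while the blending step is reduced to an illustrative additive weight rather than the tool's actual blending algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

def superimpose_lesion(chest, lung_map, lesion, alpha=0.4, max_tries=1000):
    """Draw random locations until one lies fully within the lung fields
    (per the reference lung-map), then blend the lesion in there. The
    additive weight `alpha` is an illustrative stand-in for the tool's
    intensity-blending algorithm."""
    lh, lw = lesion.shape
    for _ in range(max_tries):
        r = rng.integers(0, chest.shape[0] - lh)
        c = rng.integers(0, chest.shape[1] - lw)
        if lung_map[r:r + lh, c:c + lw].all():   # inside the lung fields
            chest[r:r + lh, c:c + lw] += alpha * lesion
            return r, c
    raise RuntimeError("no valid in-lung location found")

chest = rng.random((256, 256))                   # stand-in normal chest x-ray
lung_map = np.zeros((256, 256), dtype=bool)
lung_map[40:200, 30:120] = True                  # stand-in lung field mask
lesion = np.ones((9, 9))                         # stand-in lesion template
print("lesion placed at", superimpose_lesion(chest, lung_map, lesion))
```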
The National Institute of Standards and Technology and the National Institutes of Health have started a collaborative study on the development of lighting that will provide enhanced, tissue-specific contrast with respect to its surroundings. In this paper we describe existing NIST technologies utilized for this project such as a computational model for color rendering and a new spectrally tunable lighting technology. We will also describe the calibration and validation procedure of a hyperspectral camera system. Finally, we show examples of imaged tissues under various lighting conditions.
Monochrome monitors typically display 8 bits of data (256 shades of gray) at one time. This study determined whether monitors that can display a wider range of grayscale information (11-bit) can improve observer performance and decrease the use of window/level in detecting pulmonary nodules. Three sites participated, using 8- and 11-bit displays from three manufacturers. At each site, six radiologists reviewed 100 DR chest images on both displays. There was no significant difference in ROC Az (F = 0.0374, p = 0.8491) as a function of 8 vs 11 bit-depth. Average Az across all observers was 0.8284 with 8 bits and 0.8253 with 11 bits. There was a significant difference in overall viewing time (F = 10.209, p = 0.0014) favoring the 11-bit displays. Window/level use did not differ significantly for the two types of displays. Eye position recording on a subset of images at one site showed that cumulative dwell times for each decision category were lower with the 11-bit than with the 8-bit display. T-tests for paired observations showed that the differences for the TP (t = 1.452, p = 0.1507), FN (t = 0.050, p = 0.9609) and FP (t = 0.042, p = 0.9676) decisions were not statistically significant. The difference for the TN decisions was statistically significant (t = 1.926, p = 0.05). 8-bit displays will not negatively impact diagnostic accuracy, but using 11-bit displays may improve workflow efficiency.
Clinical radiological judgments are increasingly being made on softcopy LCD monitors. These monitors are found throughout the hospital environment in radiological reading rooms, outpatient clinics and wards. This means that the ambient lighting where clinical judgments from images are made can vary widely. Inappropriate ambient lighting has several deleterious effects: monitor reflections reduce contrast; veiling glare adds brightness; and dynamic range and the detectability of low-contrast objects are limited. Radiological images displayed on LCDs are more sensitive to the impact of inappropriate ambient lighting, and with these devices the problems described above are often more evident.
The current work aims to provide data on optimum ambient lighting, based on lesions within chest images. The data provided may be used for the establishment of workable ambient lighting standards. Ambient lighting at 30 cm from the monitor was set at 480 lux (office lighting), 100 lux (WHO recommendation), 40 lux, and <10 lux. All monitors were calibrated to the DICOM part 14 GSDF.
Sixty radiologists were presented with 30 chest images, 15 images having simulated nodular lesions of varying subtlety and size. Lesions were positioned in accordance with typical clinical presentation and were validated radiologically. Each image was presented for 30 seconds and viewers were asked to identify and score any visualized lesion from 1-4 to indicate confidence level of detection. At the end of the session, sensitivity and specificity were calculated. Analysis of the data suggests that visualization of chest lesions is affected by inappropriate lighting with chest radiologists demonstrating greater ambient lighting dependency. JAFROC analyses are currently being performed.
Currently, as a rule, digital medical systems use monochromatic Liquid Crystal Display (LCD) monitors to ensure an accurate reproduction of the Grayscale Standard Display Function (GSDF) as specified in the Digital Imaging and Communications in Medicine (DICOM) Standard. As a drawback, special panels need to be utilized in digital medical systems, while it would be preferable to use regular color panels, which are manufactured on a wide scale and are thus available at far lower prices. The method proposed introduces a temporal color dithering technique to accurately reproduce the GSDF on color monitors without losing monitor resolution. By exploiting the characteristics of the Human Visual System (HVS), the technique ensures that a satisfactory grayscale reproduction is achieved while minimizing perceptible flicker and undesired color artifacts. The algorithm has been implemented in the monitor using a low-cost Field Programmable Gate Array (FPGA). Quantitative evaluations of the luminance response of a 3 Mega-pixel color monitor have shown that compliance with the GSDF can be achieved with the accuracy level required by medical applications. At the same time, the measured color deviation is below the threshold perceivable by the human eye.
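A minimal sketch of the core idea of temporal dithering (the FPGA implementation and the color-channel aspects are not reproduced here): a gray level between two displayable levels is approximated by alternating those levels across frames so that the time-averaged luminance seen by the HVS matches the target.

```python
import numpy as np

def dither_sequence(target, n_frames=8):
    """Approximate a fractional gray level (e.g., 100.25) by a frame
    sequence of the two nearest displayable levels whose time average
    equals the target; the running carry spreads the high frames evenly,
    which helps keep flicker low."""
    lo = int(np.floor(target))
    frac = target - lo
    seq, acc = [], 0.0
    for _ in range(n_frames):
        acc += frac
        if acc >= 1.0 - 1e-9:        # emit the higher level, keep remainder
            seq.append(lo + 1)
            acc -= 1.0
        else:
            seq.append(lo)
    return seq

seq = dither_sequence(100.25)
print(seq, "time average =", sum(seq) / len(seq))   # -> 100.25
```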
Radiological images are today mostly displayed on monitors, but much is still unknown regarding the interaction between monitor and viewer. Issues like monitor luminance range, calibration, contrast resolution and luminance distribution need to be addressed further. To perform vision research of high validity to the radiologists, test images should be presented on medical displays. One of the problems has been how to display low contrast patterns in a strictly controlled way. This paper demonstrates how to generate test patterns close to the detection limit on a medical grade display using subpixel modulation. Patterns are generated with both 8-bit and 10-bit monitor input. With this technique, up to 7162 luminance levels can be displayed and the average separation is approximately 0.08 of a JND (Just Noticeable Difference) on a display with a luminance range between 1 and 400 cd/m2. These patterns were used in a 2AFC detection task and the detection threshold was found to be 0.75 ± 0.02 of a JND when the adaptation level was the same as the target luminance (20 cd/m2). This is a reasonable result considering that the magnitude of a JND is based on the method of adjustment rather than on a detection task. When test patterns with a different luminance than the adaptation level (20 cd/m2) were displayed, the detection thresholds were 1.11 and 1.06 of a JND for target luminance values 1.8 and 350 cd/m2, respectively.
Breast tomosynthesis is currently an investigational imaging technique requiring optimization of its many combinations
of data acquisition and image reconstruction parameters for optimum clinical use. In this study, the effects of several
acquisition parameters on the visual conspicuity of diagnostic features were evaluated for three breast specimens using a
visual discrimination model (VDM). Acquisition parameters included total exposure, number of views, full resolution
and binning modes, and lag correction. The diagnostic features considered in these specimens were mass margins,
microcalcifications, and mass spicules. Metrics of feature contrast were computed for each image by defining two
regions containing the selected feature (Signal) and surrounding background (Noise), and then computing the difference
in VDM channel metrics between Signal and Noise regions in units of just-noticeable differences (JNDs). Scans with
25 views and exposure levels comparable to a standard two-view mammography exam produced higher levels of feature
contrast. The effects of binning and lag correction on feature contrast were found to be generally small and isolated,
consistent with our visual assessments of the images. Binning produced a slight loss of spatial resolution which could
be compensated in the reconstruction filter. These results suggest that good image quality can be achieved with the
faster and therefore more clinically practical 25-view scans with binning, which can be performed in as little as 12.5
seconds. Further work will investigate other specimens as well as alternate figures of merit in order to help determine
optimal acquisition and reconstruction parameters for clinical trials.
LCDs age, and as they do so the whitepoint shifts to a yellow hue. This changes the appearance of the displayed images. We examined whether this shift impacts the observer performance and visual search efficiency of radiologists interpreting images. Six radiologists viewed 50 DR chest images on three LCDs that had their whitepoint adjusted to simulate monitor age (new, 1 year old, 2.5 years old). They reported the presence or absence of nodules along with their confidence. Visual search was measured on a subset of 15 images using eye position recording techniques. The results indicate that there was no statistically significant difference in ROC performance due to monitor age (F = 0.4901, p = 0.6187). There were no statistically significant differences between the three monitors in terms of total viewing time (F = 0.056, p = 0.9452). Dwell times for each decision type did not differ significantly as a function of monitor age. The shift in whitepoint towards the yellow range (at least up to 2.5 years of age) does not impact the diagnostic accuracy or visual search efficiency of radiologists.
This study evaluated the potential clinical utility of a high-performance (3 Mega-pixel) color display compared with two monochrome displays--one of comparable luminance (250 cd/m2) and one of higher luminance (450 cd/m2). Six radiologists viewed 50 DR chest images, half with nodules and half without, once on each display. Eye position was recorded on a subset of images. There was no statistically significant difference in ROC Az performance as a function of monitor (F = 1.176, p = 0.3127), although there was a clear trend towards the monochrome 450 cd/m2 monitor being better than the monochrome 250 cd/m2 monitor, which was better than the color monitor. In terms of total viewing time, there were no statistically significant differences between the three monitors (F = 1.478, p = 0.2298). The dwell times associated with true- and false-positive decisions were shortest for the high-luminance monochrome display, longer for the low-luminance monochrome display, and longest for the low-luminance color display. Dwells for the false-negative decisions were longest for the high-luminance monochrome display, shorter for the low-luminance monochrome display, and shortest for the low-luminance color display. The true-negative dwells were not significantly different. The study suggests that high-luminance displays may have an advantage in terms of diagnostic accuracy and visual search efficiency for detecting nodules in chest images compared to both monochrome and color lower-luminance displays, although these differences may have little clinical impact because they are relatively small.
An observer performing a detection task analyzes an image and produces a single number, a test statistic, for
that image. This test statistic represents the observer's "confidence" that a signal (e.g., a tumor) is present. The
linear observer that maximizes the test-statistic SNR is known as the Hotelling observer. Generally, computation
of the Hotelling SNR, or Hotelling trace, requires the inverse of a large covariance matrix. Recent developments
have resulted in methods for the estimation and inversion of these large covariance matrices with relatively
small numbers of images. The estimation and inversion of these matrices is made possible by a covariance matrix
decomposition that splits the full covariance matrix into an average detector-noise component and a
background-variability component. Because the average detector-noise component is often diagonal and/or
easily estimated, a full-rank, invertible covariance matrix can be produced with few images. We have studied
the bias of estimates of the Hotelling trace using this decomposition for high-detector-noise and low-detector-noise
situations. In extremely low-noise situations, this covariance decomposition may result in a significant
bias. We will present a theoretical evaluation of the Hotelling-trace bias, as well as extensive simulation studies.
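A hedged sketch of the decomposition described: a diagonal detector-noise term plus a low-rank sample estimate of background variability yields a full-rank covariance from few images, from which the Hotelling SNR follows. Dimensions and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n_imgs = 64, 30            # pixels, and deliberately few sample images

# Ground truth: diagonal detector noise plus low-rank background variability.
sigma_noise = 0.5
B = rng.standard_normal((d, 4))
K_bg_true = B @ B.T

signal = np.zeros(d)
signal[28:36] = 1.0

# Estimate the background covariance from noise-free background samples.
bgs = rng.multivariate_normal(np.zeros(d), K_bg_true, n_imgs)
K_bg_hat = np.cov(bgs.T)      # rank <= n_imgs - 1: singular on its own

# The diagonal detector-noise term makes the summed matrix invertible even
# though the background estimate is rank-deficient.
K_hat = sigma_noise ** 2 * np.eye(d) + K_bg_hat

snr2 = signal @ np.linalg.solve(K_hat, signal)   # Hotelling SNR^2
print("estimated Hotelling SNR =", np.sqrt(snr2))
# As sigma_noise -> 0 the regularizing diagonal term vanishes and the
# estimate becomes unstable; this is the low-noise bias regime studied above.
```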
Any observer performing a detection task on an image produces a single number that represents the observer's
confidence that a signal (e.g., a tumor) is present. A linear observer produces this test statistic using a linear
template or a linear discriminant. The optimal linear discriminant is well-known to be the Hotelling observer
and uses both first- and second-order statistics of the image data. There are many situations where it is
advantageous to consider discriminant functions that adapt themselves to some characteristics of the data. In
these situations, the linear template is itself a function of the data and, thus, the observer is nonlinear. In this
paper, we present an example adaptive Hotelling discriminant and compare the performance of this observer to
that of the Hotelling observer and the Bayesian ideal observer. The task is to detect a signal that is embedded in
one of a finite number of possible random backgrounds. Each random background is Gaussian but has different
covariance properties. The observer uses the image data to determine which background type is present and
then uses the template appropriate for that background. We show that the performance of this particular
observer falls between that of the Hotelling and ideal observers.
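A minimal sketch of such an adaptive observer under illustrative assumptions: one Hotelling template is precomputed per background type, the background is classified from the image (here by Gaussian log-likelihood, an illustrative choice that ignores the weak signal), and the matching template is applied.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 32
signal = np.zeros(d)
signal[12:20] = 0.4

# Two background types: zero-mean Gaussians with different covariances.
covs = []
for _ in range(2):
    A = rng.standard_normal((d, d))
    covs.append(A @ A.T / d + 0.1 * np.eye(d))

templates = [np.linalg.solve(K, signal) for K in covs]   # per-type Hotelling
inv_covs = [np.linalg.inv(K) for K in covs]
logdets = [np.linalg.slogdet(K)[1] for K in covs]

def adaptive_test_statistic(g):
    """Classify the background by Gaussian log-likelihood (the weak signal
    is ignored in this step for simplicity), then apply the Hotelling
    template matched to the chosen background type."""
    ll = [-0.5 * (g @ ic @ g) - 0.5 * ld for ic, ld in zip(inv_covs, logdets)]
    return templates[int(np.argmax(ll))] @ g

# One signal-present image drawn from background type 0:
g = rng.multivariate_normal(np.zeros(d), covs[0]) + signal
print("adaptive test statistic:", adaptive_test_statistic(g))
```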
In this study we estimated human observer templates associated with the detection of a realistic mass signal
superimposed on real and simulated but realistic synthetic mammographic backgrounds. Five trained naïve observers
participated in two-alternative forced-choice (2-AFC) experiments in which they were asked to detect a spherical mass
signal extracted from a mammographic phantom. This signal was superimposed on statistically stationary clustered
lumpy backgrounds (CLB) in one instance, and on nonstationary real mammographic backgrounds in another. Human
observer linear templates were estimated using a genetic algorithm. An additional 2-AFC experiment was conducted
with twin noise in order to determine which local statistical properties of the real backgrounds influenced the ability of
the human observers to detect the signal.
Results show that the estimated linear templates are not significantly different for stationary and nonstationary
backgrounds. The estimated performance of the linear template compared with the human observer is within 5% in
terms of percent correct (Pc) for the 2-AFC task. Detection efficiency is significantly higher on nonstationary real
backgrounds than on globally stationary synthetic CLB.
Using the twin-noise experiment and a new method to relate image features to observers' trial-to-trial decisions, we found
that the local statistical properties that hindered or facilitated the detection task were the standard deviation and three
features derived from the neighborhood gray-tone difference matrix: coarseness, contrast, and strength. These statistical
features showed a dependency on human performance only when estimated within a sufficiently small area around the
searched location. These findings emphasize that nonstationary backgrounds need to be described by
their local statistics and not by global ones like the noise Wiener spectrum.
Previously, a non-prewhitening matched filter (NPWMF) incorporating a model for the contrast sensitivity of the
human visual system was introduced for modeling human performance in detection tasks with different viewing
angles and white-noise backgrounds by Badano et al. But NPWMF observers do not perform well in detection
tasks involving complex backgrounds, since they do not account for background randomness. A channelized-Hotelling
observer (CHO) using difference-of-Gaussians (DOG) channels has been shown to track human performance well
in detection tasks using lumpy backgrounds. In this work, a CHO with DOG channels, incorporating the model
of the human contrast sensitivity, was developed similarly. We call this new observer a contrast-sensitive CHO
(CS-CHO). The Barten model was the basis of our human contrast sensitivity model. It was multiplied by a
scalar, which was varied to control the thresholding effect of the contrast sensitivity on luminance-valued
images and hence the performance-prediction ability of the CS-CHO. The performance of the CS-CHO was
compared to the average human performance from the psychophysical study by Park et al., where the task
was to detect a known Gaussian signal in non-Gaussian distributed lumpy backgrounds. Six different signal-intensity
values were used in this study. We chose the free parameter of our model to match the mean human
performance in the detection experiment at the strongest signal intensity. Then we compared the model to the
human at five different signal-intensity values in order to see if the performance of the CS-CHO matched human
performance. Our results indicate that the CS-CHO with the chosen scalar for the contrast sensitivity predicts
human performance closely as a function of signal intensity.
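The DOG channels referred to here have a standard radial-frequency form; a sketch constructing such channels, with illustrative parameter values:

```python
import numpy as np

def dog_channels(n_pix, n_channels=6, sigma0=0.015, alpha=2.0, q=1.67):
    """Difference-of-Gaussians channels on the 2D frequency plane:
        C_j(rho) = exp(-0.5 (rho/(q s_j))^2) - exp(-0.5 (rho/s_j)^2),
    with s_j = sigma0 * alpha**j (parameter values are illustrative).
    Returns an (n_pix*n_pix, n_channels) matrix of spatial-domain channels."""
    f = np.fft.fftfreq(n_pix)
    rho = np.hypot(*np.meshgrid(f, f))            # radial spatial frequency
    chans = []
    for j in range(n_channels):
        s = sigma0 * alpha ** j
        Cj = np.exp(-0.5 * (rho / (q * s)) ** 2) - np.exp(-0.5 * (rho / s) ** 2)
        u = np.real(np.fft.ifft2(Cj))             # back to the spatial domain
        chans.append(u.ravel())
    return np.array(chans).T

U = dog_channels(64)
print(U.shape)   # (4096, 6): channelized data is U.T @ image_vector
```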
Active-matrix liquid crystal displays (LCDs) are becoming widely used in medical imaging applications. With
the increasing volume of CT images to be interpreted per day, the ability of showing a fast sequence of images
in stack mode is preferable for a medical display. The slow temporal response of an LCD can compromise
image quality/fidelity when the images are browsed in a fast sequence. In this paper, we report on the effect
of the LCD response time at different image browsing speeds based on the performance of a contrast-sensitive
channelized-Hotelling observer. A correlated stack of simulated clustered lumpy background images with a signal
present in some of the images was used. The effect of different browsing speeds is calculated with LCD temporal
response measurements established in our previous work. The image set is then analyzed by the model observer,
which has been shown to predict human detection performance in non-Gaussian lumpy backgrounds. This allows
us to quantify the effect of slow temporal response of medical liquid crystal displays on the performance of the
anthropomorphic observer. Slow temporal response of the display device greatly affects the lesion contrast and
observer performance. This methodology, after validation with human observers, could be used to set limits for
the rendering speed of large volumetric image datasets (from CT, MR, or tomosynthesis) read in stack-mode.
Model observers have been used successfully to predict human observer performance and to evaluate image quality for
detection tasks on various backgrounds in medical applications. This paper applies closed-form compression-noise
statistics to model observers, deriving a channelized Hotelling observer (CHO) for decompressed
images. The performance of the CHO on decompressed images is validated using the JPEG compression algorithm and
lumpy-background images. The results show that the derived CHO performance closely predicts its simulated performance.
The purpose of this paper is to describe FROC (free-response receiver operating characteristic) curves predicted by a recent model of visual search. The model is characterized by three parameters (μ, λ and ν) which quantify perceived lesion signal-to-noise ratio, the average number of non-lesion locations per image considered for marking by the observer, and the probability that a lesion is considered for marking, respectively. An important characteristic of a search-model predicted FROC curve is that it is contained within the rectangle with corners at (0, 0) and (λ, ν). It is shown that λ and ν determine the x and y end-point coordinates of the FROC curve, respectively, and μ determines the sharpness of the transition from vertical slope at the origin to zero slope at (λ, ν). Two figures of merit (FOM) quantifying free-response performance are described. A FOM commonly used by CAD developers is the ordinate of the FROC curve at a specified abscissa. Another FOM, recently introduced by us, measures the ability of the observer to discriminate between normal and abnormal images. The latter is analogous to the Az measure widely used in ROC methodology. The search model is related to the initial detection and candidate analysis (IDCA) method of fitting FROC curves, but a key assumption, the shapes of the fitted curves, and the estimation methods are different. The search model yielded excellent fits to a designer-level and to a simulated clinical-level CAD data set. Available software implementing these ideas is expected to aid in the optimization of CAD algorithms.
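One common parameterization with the end-point behavior described (used here purely for illustration; the paper's exact functional form may differ) is x(ζ) = λΦ(−ζ) and y(ζ) = νΦ(μ − ζ), which runs from (0, 0) at a strict threshold to (λ, ν) at a lax one:

```python
import numpy as np
from scipy.stats import norm

# Illustrative search-model parameters: perceived lesion SNR, mean number of
# considered non-lesion locations per image, and P(lesion considered).
mu, lam, nu = 2.0, 1.5, 0.9

zeta = np.linspace(-4, 6, 400)      # marking threshold (lax -> strict)
nlf = lam * norm.cdf(-zeta)         # x: non-lesion localizations per image
llf = nu * norm.cdf(mu - zeta)      # y: lesion localization fraction

# End-point check: the curve is confined to the rectangle (0,0)-(lam, nu),
# and mu controls how sharply it turns from vertical toward flat.
print(f"strict threshold: ({nlf[-1]:.3f}, {llf[-1]:.3f})")   # near (0, 0)
print(f"lax threshold:    ({nlf[0]:.3f}, {llf[0]:.3f})")     # near (lam, nu)
```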
If the locations of abnormalities (targets) in an image are unknown, the evaluation of human observers' detection performance can be complex. Richard Swensson in 1996 developed a model that unified the various analysis approaches to this problem. For the LROC experiment, the model assumed that a false-positive report arises from the latent decision variable of the most suspicious non-target location of the target stimuli. The localization scoring was based on the same latent decision variable, i.e., when the latent decision variable at the non-target location was greater than the latent decision variable at the target location, the response was scored as a miss. Human observer reports vary, i.e., different locations have been identified during replications. A Monte Carlo model was developed to investigate this variation and identified a non-intuitive aspect of Swensson's LROC model. When the number of potentially suspicious locations was 1, the model performance was greater than apparently possible. For example, assume that the expected latent decision variable of the target is 1.0, with both target and non-target standard deviations equal to 1.0. The model predicts an area under the ROC curve of 0.815, which implies da = 1.27. If the target latent decision variable was 0.0, then da = 0.61. The reason is that the model uses one latent decision variable for the non-target stimuli, while for the target stimuli it takes the maximum of two latent decision variables. The simulation indicated that the parameters of an LROC fit, when the number of suspicious locations is small or the observer performance is low, do not have the same intuitive meaning as the ROC parameters of an SKE task.
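The non-intuitive da = 0.61 result can be checked with a short Monte Carlo sketch of this reading of the model (target rating = maximum of two latent variables, non-target rating = a single variable); values are as in the example above.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n = 200_000

    # Non-target stimuli: a single latent decision variable, N(0, 1).
    x0 = rng.normal(0.0, 1.0, n)
    # Target stimuli: the maximum of two latent variables; the target mean is
    # set to 0.0, so any separation comes purely from taking the maximum.
    x1 = np.maximum(rng.normal(0.0, 1.0, n), rng.normal(0.0, 1.0, n))

    # Rank-based (Mann-Whitney) estimate of the area under the ROC curve.
    scores = np.concatenate([x0, x1])
    ranks = np.empty(2 * n)
    ranks[scores.argsort()] = np.arange(1, 2 * n + 1)
    auc = (ranks[n:].mean() - (n + 1) / 2) / n
    da = np.sqrt(2.0) * norm.ppf(auc)     # da defined via AUC = Phi(da / sqrt(2))
    print(round(auc, 3), round(da, 2))    # approximately 0.667 and 0.61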
In medical imaging, signal detection is one of the most important tasks. It is especially important
to study detection tasks with signal location uncertainty. One way to evaluate system performance
on such tasks is to compute the area under the localization-receiver operating characteristic (LROC)
curve. In an LROC study, detecting a signal includes two steps. The first step is to compute a test
statistic to determine whether the signal is present or absent. If the signal is present, the second step
is to identify the location of the signal. We use the test statistic which maximizes the area under the
LROC curve (ALROC). We attempt to capture the distribution of this ideal LROC test statistic with
signal-absent data using the extreme value distribution. Some simulated test statistics are shown along
with extreme value distributions to illustrate how well our approximation captures the characteristics
of the ideal LROC test statistic. We further derive an approximation to the ideal ALROC using the
extreme value distribution and compare it to the direct simulation of the ALROC. Using a different
approach by defining a parameterized probability density function of the data, we are able to derive
another approximation to the ideal ALROC for weak signals from a power series expansion in signal
amplitude.
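A minimal sketch of the core idea, under the simplifying assumption that the signal-absent location statistics are i.i.d. standard normal (the actual ideal-observer statistics are generally more complicated):

    import numpy as np
    from scipy.stats import gumbel_r

    M, n = 64, 200_000                    # locations per image, trials
    rng = np.random.default_rng(1)
    t_max = rng.normal(size=(n, M)).max(axis=1)   # signal-absent max statistic

    # Classical Gumbel (extreme value) approximation for the max of M normals.
    b = np.sqrt(2 * np.log(M))
    loc = b - (np.log(np.log(M)) + np.log(4 * np.pi)) / (2 * b)
    scale = 1.0 / b

    # Compare a tail probability; agreement is approximate (convergence is slow).
    zeta = 3.0
    print(np.mean(t_max > zeta), gumbel_r.sf(zeta, loc=loc, scale=scale))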
Maximum likelihood estimation of receiver operating characteristic (ROC) curves using the "proper" binormal model
can be interpreted in terms of Bayesian estimation as assuming a flat joint prior distribution on the c and da parameters.
However, this is equivalent to assuming a non-flat prior distribution for the area under the curve (AUC) that peaks at
AUC = 1.0. We hypothesize that this implicit prior on AUC biases the maximum likelihood estimate (MLE) of AUC.
We propose a Bayesian implementation of the "proper" binormal ROC curve-fitting model with a prior distribution that
is marginally flat on AUC and conditionally flat over c. This specifies a non-flat joint prior for c and da. We developed
a Markov chain Monte Carlo (MCMC) algorithm to estimate the posterior distribution and the maximum a posteriori
(MAP) estimate of AUC. We performed a simulation study using 500 draws of a small dataset (25 normal and 25
abnormal cases) with an underlying AUC value of 0.85. When the prior distribution was a flat joint prior on c and da, the MLE and MAP estimates agreed, suggesting that the MCMC algorithm worked correctly. When the prior
distribution was marginally flat on AUC, the MAP estimate of AUC appeared to be biased low. However, the MAP
estimate of AUC for perfectly separable degenerate datasets did not appear to be biased. Further work is needed to
validate the algorithm and refine the prior assumptions.
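The implicit prior can be visualized with a few lines of Python: in the proper binormal model AUC = Φ(da/√2), so drawing da from a flat prior (here over an arbitrary illustrative range) yields an AUC histogram that piles up near 1.0.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    da = rng.uniform(0.0, 6.0, 500_000)   # flat prior on da; range is illustrative
    auc = norm.cdf(da / np.sqrt(2.0))     # induced AUC values

    hist, _ = np.histogram(auc, bins=10, range=(0.5, 1.0), density=True)
    print(np.round(hist, 2))   # density rises steeply toward AUC = 1.0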
A flexible software tool was developed that combines predictive models for detector noise and blur with image
simulation and an improved human observer model to predict the clinical task performance of existing and future
radiographic systems. The model starts with high-fidelity images from a database and mathematical models of common
disease features, which may be added to the images at desired contrast levels. These images are processed through the
entire imaging chain including capture, the detector, image processing, and hardcopy or softcopy display. The simulated
images and the viewing conditions are passed to a human observer model, which calculates the detectability index d' of
the signal (disease or target feature). The visual model incorporates a channelized Hotelling observer with a luminance-dependent
contrast sensitivity function and two types of internal visual system noise (intrinsic and image background-induced).
It was optimized based on three independent human observer studies of target detection, and is able to predict
d' over a wide range of viewing conditions, background complexities, and target spatial frequency content. A more
intuitive metric of system performance, Task-Specific Detective Efficiency (TSDE), is defined to indicate how much
detector improvements would translate to better radiologist performance. The TSDE is calculated as the squared ratio of
d' for a system with the actual detector and a hypothetical system containing an ideal detector. A low TSDE, e.g., 5% for
the detection of 0.1 mm microcalcifications in typical mammography systems, indicates that improvements in the
detector characteristics are likely to translate to better detection performance. The TSDE of lung nodule detection is as
high as 75% even with the detective quantum efficiency (DQE) of the detector not exceeding 24%. Applications of the
model to system optimizations for flat-panel detectors, in mammography and dual energy digital radiography, are
discussed.
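Given the definition above, the metric itself is a one-liner; the d' values below are made-up illustrations, not the study's results.

    # TSDE: squared ratio of d' with the actual detector to d' with a
    # hypothetical ideal detector in an otherwise identical system.
    def tsde(dprime_actual: float, dprime_ideal: float) -> float:
        return (dprime_actual / dprime_ideal) ** 2

    print(tsde(0.9, 4.0))   # ~0.05: strongly detector-limited task
    print(tsde(3.5, 4.0))   # ~0.77: little to gain from a better detector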
Some diagnostic tasks in MRI involve determining the presence of a faint feature (target) relative to a dark
background. In MR images produced by taking pixel magnitudes it is well known that the contrast between faint
features and dark backgrounds is reduced due to the Rician noise distribution. In an attempt to enhance detection
we implemented three different MRI reconstruction algorithms: the normal magnitude, phase-corrected real, and
a wavelet thresholding algorithm designed particularly for MRI noise suppression and contrast enhancement.
To compare these reconstructions, we had volunteers perform a two-alternative forced choice (2AFC) signal
detection task. The stimuli were produced from high-field head MRI images with synthetic thermal noise added
to ensure realistic backgrounds. Circular targets were located in regions of the image that were dark, but
next to bright anatomy. Images were processed using one of the three reconstruction techniques. In addition
we compared a channelized Hotelling observer (CHO) to the human observers in this task. We measured the
percentage correct in both the human and model observer experiments.
Our results showed better performance with the use of magnitude or phase-corrected real images compared
to the use of the wavelet algorithm. In particular, artifacts induced by the wavelet algorithm seem to distract
some users and produce significant inter-subject variability. This contradicts predictions based only on SNR.
The CHO matched the mean human results quite closely, demonstrating that this model observer may be used
to simulate human response in MRI target detection tasks.
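The contrast loss from magnitude reconstruction can be illustrated with a small simulation (signal and noise levels here are arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    sigma, n = 1.0, 1_000_000

    def magnitude_mean(a):
        # Mean of |a + complex Gaussian noise|, i.e., of a Rician variate.
        return np.hypot(a + rng.normal(0, sigma, n), rng.normal(0, sigma, n)).mean()

    bg, target = magnitude_mean(0.0), magnitude_mean(1.0)
    # The noise floor lifts the dark background, shrinking the apparent contrast
    # well below the true underlying value of 1.0.
    print(round(bg, 3), round(target, 3), round(target - bg, 3))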
Signal detection by the channelized Hotelling (ch-Hotelling) observer is studied for tomographic application by employing a small, tractable 2D model of a computed tomography (CT) system. The primary goal of this manuscript is to develop a practical method for evaluating the ch-Hotelling observer that can generalize to larger 3D cone-beam CT systems. The use of the ch-Hotelling observer for evaluating tomographic image reconstruction algorithms is also demonstrated. For a realistic CT model, the ch-Hotelling observer can be a good approximation to the ideal observer. The ch-Hotelling observer is applied to both the projection data and the reconstructed images; the difference in signal-to-noise ratio for signal detection between these two domains provides a metric for evaluating the image reconstruction algorithm.
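For reference, the SNR in question is the standard Hotelling figure of merit, SNR² = Δḡᵀ K⁻¹ Δḡ; a toy channelized computation (dimensions and data are arbitrary stand-ins) looks like this:

    import numpy as np

    rng = np.random.default_rng(4)
    n_chan, n_samp = 10, 5000

    # Toy channelized data: signal-absent samples and a constant channel signal.
    g0 = rng.normal(0.0, 1.0, (n_samp, n_chan))
    g1 = rng.normal(0.3, 1.0, (n_samp, n_chan))   # purely illustrative signal

    dg = g1.mean(axis=0) - g0.mean(axis=0)
    K = 0.5 * (np.cov(g0.T) + np.cov(g1.T))       # pooled channel covariance
    w = np.linalg.solve(K, dg)                    # Hotelling template
    print(np.sqrt(dg @ w))                        # SNR = sqrt(dg^T K^-1 dg)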
There is an extraordinary number of fast MR imaging techniques, especially for parallel imaging. When one considers
multiple reconstruction algorithms, reconstruction parameters, coil configurations, acceleration factors, noise levels, and
multiple test images, one can easily create thousands of test images for image quality evaluation. We have found the
perceptual difference model (Case-PDM) to be quite useful as a means of rapid quantitative image quality evaluation in
such experiments, and have applied it to keyhole, spiral, SENSE, and GRAPPA applications. In this study, we have
compared human evaluation of MR images from multiple organs and from multiple image reconstruction algorithms to
Case-PDM. We compared human DSCQS (Double Stimulus Continuous Quality Scale) scoring against Case-PDM
measurements for 3 different image types and 3 different image reconstruction algorithms. We found that Case-PDM
linearly correlated (r > 0.9) with human subject ratings over a very large range of image quality. We also compared
Case-PDM to other image quality evaluation methods. Case-PDM generally performed better than NASA's DCTune,
MITRE's IQM, Zhou Wang's NR models and mean square error (MSE) method, by showing a higher Pearson
correlation coefficient, higher Spearman rank-order correlation and lower root-mean-squared error. All three models
(Case-PDM, Sarnoff's IDM, and Zhou Wang's SSIM) performed very similarly in this experiment. To focus on high
quality reconstructions, we performed a 2-AFC (two-alternative forced choice) experiment to determine the "just perceptible
difference" between two images. We found that threshold Case-PDM scores changed little (0.6-1.8) with 2 different
image types and 3 degradation patterns, and results with Case-PDM were much tighter than the other methods (IDM and
MSE) by showing a lower ratio of mean to standard deviation value. We conclude that Case-PDM can correctly predict
the ordering of image quality over a large range of image quality. Case-PDM can also be used to screen the images
which are "perceptually equal" to the original image. Although Case-PDM is a very useful tool for comparing "similar
raw images with similar processing," one should be careful when interpreting Case-PDM scores across MR images.
The ideal observer (IO) employs complete knowledge of the available data statistics and sets an upper limit on the observer performance on a binary classification task. Kupinski proposed an IO estimation method using Markov chain Monte Carlo (MCMC) techniques. In principle, this method can be generalized to any parameterized phantoms and simulated imaging systems. In practice, however, it can be computationally burdensome, because it requires sampling the object distribution and simulating the imaging process a large number of times during the MCMC estimation process. In this work we propose methods that allow application of MCMC techniques to cardiac SPECT imaging IO estimation using a parameterized torso phantom and an accurate analytical projection algorithm that models the SPECT image formation process. To accelerate the imaging simulation process and thus enable the MCMC IO estimation, we used a phantom model with discretized anatomical parameters and continuous uptake parameters. The imaging process simulation was modeled by pre-computing projections for each organ in the finite number of discretely-parameterized anatomic models and taking linear combinations of the organ projections based on sampling of the continuous organ uptake parameters. The proposed method greatly reduces the computational burden and makes MCMC IO estimation for cardiac SPECT imaging possible.
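A sketch of the accelerated projection step described above (array names and shapes are our own, for illustration): because the imaging operator is linear in the organ uptakes, a sampled object's projection is just a weighted sum of organ projections precomputed for each discretized anatomy.

    import numpy as np

    n_anat, n_organ, n_pix = 8, 4, 64 * 64
    rng = np.random.default_rng(5)
    # Precomputed offline: one projection per organ per discretized anatomy.
    organ_proj = rng.random((n_anat, n_organ, n_pix))

    def project(anat_idx, uptakes):
        # Projection of one sampled object: sum_k uptake_k * P[anatomy, organ k].
        return uptakes @ organ_proj[anat_idx]

    g = project(3, np.array([1.0, 0.2, 0.5, 0.05]))   # sampled continuous uptakes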
Recent developments in low-noise, large-area CCD detectors have
renewed interest in radiographic systems that use a lens to couple
light from a scintillation screen to a detector. The lenses for this application must have very large numerical apertures and high spatial resolution across the field of view (FOV). This paper expands on our earlier
work by applying the principles of task-based assessment of image
quality to development of meaningful figures of merit for the
lenses.
The task considered in this study is detection of a lesion in a
mammogram, and the figure of merit used is the lesion detectability,
expressed as a task-based signal-to-noise ratio (SNR), for a
channelized Hotelling observer (CHO). As in the previous work, the
statistical model accounts for the random structure in the breast,
the statistical properties of the scintillation screen, the random
coupling of light to the CCD, the detailed structure of the
shift-variant lens point spread function (PSF), and Poisson noise of
the X-ray flux.
The lenses considered range from F/0.9 to F/1.2. All yield nominally
the same spot size at a given field. Among the F/0.9 lenses, some of
them were designed by conventional means for high resolution and
some for high contrast, and the shapes of the PSF differ
considerably. The results show that excessively large lens numerical
apertures do not improve the task-based SNR but dramatically
increase the optics fabrication cost. Contrary to common wisdom,
high-contrast designs have higher task-based SNRs than
high-resolution designs when the signal is small. Additionally, we
constructed a merit function to successfully tune the lenses to
perform equally well anywhere in the FOV.
To accurately detect radiological signs of cancer, mammography requires the best possible image quality for a target patient dose. The application of automatic optimization of parameters (AOP) to digital systems has been improved recently. The metric used to derive this AOP was based on the expected CNR of calcium material in a uniform background. In this work, we use a new metric based on the detection performance of an a-contrario observer on lesions in simulated images. Breast images at various thicknesses and glandularity levels were simulated with flat and textured backgrounds. Various exposure spectra (Mo/Mo, Mo/Rh and Rh/Rh anode/filter materials, kVp ranging from 25 to 33 kV) were considered. The tube output was normalized in order to obtain comparable average glandular dose (AGD) values for each image of a given breast over the various acquisition techniques. Images were scored with the a-contrario observer, the performance criterion being the minimal lesion size needed to reach a given detection threshold. The optimal spectra are found to be similar to those delivered by the AOP in both flat and textured backgrounds. The choice of the anode/filter combination appears to be more critical than kVp adjustments, in particular for thicker breasts. Our approach also yields an estimate of the detection variability due to texture signal. We found that the anatomical structure variability cannot be overcome by beam quality optimization of the current system in the presence of complex backgrounds, which confirms the potential benefit of any imaging technology that reduces the variability of detection due to texture.
This study evaluates new observer models for 3D whole-body Positron Emission Tomography (PET) imaging based on a wavelet sub-band decomposition and compares them with the classical constant-Q CHO model. Our final goal is to develop an original method that performs guided detection of abnormal activity foci in PET oncology imaging based on these new observer models. This computer-aided diagnostic method would be of great benefit to clinicians for diagnostic purposes and to biologists for large-scale screening of rodent populations in molecular imaging. Method: We have previously shown good correlation of the channelized Hotelling observer (CHO) using a constant-Q model with human observer performance for 3D PET oncology imaging. We propose an alternate method based on combining a CHO observer with a wavelet sub-band decomposition of the image and we compare it to the standard CHO implementation. This method performs an undecimated transform using a biorthogonal B-spline 4/4 wavelet basis to extract the feature set for input to the Hotelling observer. This work is based on simulated 3D PET images of an extended MCAT phantom with randomly located lesions. We compare three evaluation criteria: classification performance using the signal-to-noise ratio (SNR), computation efficiency, and visual quality of the derived 3D maps of the decision variable λ. The SNR is estimated on a series of test images for a variable number of training images for both observers. Results: Results show that the maximum SNR is higher with the constant-Q CHO observer, especially for targets located in the liver, and that it is reached with a smaller number of training images. However, preliminary analysis indicates that the visual quality of the 3D maps of the decision variable λ is higher with the wavelet-based CHO, and the computation time to derive a 3D λ-map is about 350 times shorter than for the standard CHO. This suggests that the wavelet-CHO observer is a good candidate for use in our guided detection method.
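The feature-extraction step can be sketched in a few lines, assuming the PyWavelets library and its 'bior4.4' basis as a stand-in for the biorthogonal B-spline 4/4 wavelet (2D shown for brevity; the study uses 3D PET volumes):

    import numpy as np
    import pywt

    img = np.random.rand(64, 64)                 # stand-in for a PET slice
    # Undecimated (stationary) wavelet transform, three decomposition levels.
    coeffs = pywt.swt2(img, 'bior4.4', level=3)

    # Stack the approximation and detail sub-bands into one feature vector
    # that would feed the Hotelling observer.
    features = np.concatenate(
        [band.ravel() for cA, (cH, cV, cD) in coeffs for band in (cA, cH, cV, cD)])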
For more than a decade, computer-aided detection (CAD) for pulmonary nodules has been an active research area, and numerous publications are dedicated to the topic. Most authors have created their own database with their own ground truth for validation, which makes it hard to compare the performance of different systems with each other. It is a known fact that the performance of a CAD system can differ significantly depending on the data on which it is tested and on the underlying ground truth. The Lung Image Database Consortium (LIDC) has recently released 93 publicly available lung images with ground truth lists from 4 different radiologists. This database will make it possible to compare the performance of different CAD algorithms. In this paper we take a first step toward using the LIDC data as a benchmark test. We present a CAD algorithm with a validation study on these data sets. The CAD performance was analyzed by means of multiple free-response receiver operating characteristic (FROC) curves for different lower thresholds of the nodule diameter. There are different ways to merge the ground truth lists of the 4 radiologists, and we discuss the performance of our CAD algorithm for several of these possibilities. For nodules with a volume-equivalent diameter ≥ 4 mm that have been simultaneously confirmed by all four radiologists, our CAD system shows a detection rate of 89% at a median false positive rate of 2 findings per patient.
The purpose of this study was to evaluate the effect of computer-aided diagnosis (CAD) on radiologists' performance for
the detection of lung nodules on thoracic CT scans. Our computer system was designed using an independent training set
of 94 CT scans in our laboratory. The data set for the observer performance study consisted of 48 CT scans. Twenty scans were collected from patient files at the University of Michigan, and 28 scans were provided by the Lung Image Database Consortium (LIDC). All scans were read by multiple experienced thoracic radiologists to determine the true nodule locations, defined as any region identified by one or more expert radiologists as containing a nodule larger than 3 mm in diameter. Eighteen CT examinations were nodule-free, while the remaining 30 CT examinations contained a total of 73 nodules having a median size of 5.5 mm (range 3.0-36.4 mm). Four other study radiologists read the CT scans first without and then with CAD, and provided likelihood-of-nodule ratings for suspicious regions. Two of the study radiologists were fellowship-trained in cardiothoracic radiology, and two were cardiothoracic radiology fellows. Free-response receiver operating characteristic (FROC) curves were used to compare the two reading conditions. The
computer system had a sensitivity of 79% (58/73) with an average of 4.9 marks per normal scan (88/18). Jackknife
alternative FROC (JAFROC) analysis indicated that the improvement with CAD was statistically significant (p=0.03).
In this study we developed an effective novel method for reducing the variability in the output of different artificial neural network (ANN) configurations that have the same overall performance as measured by the area under their receiver operating characteristic (ROC) curves. This variability can lead to inaccuracies in the interpretation of results when the outputs are employed as classification predictors. We extended a method previously proposed to reduce the variability in the performance of a classifier with data sets from different institutions to the outputs of ANN configurations. Our approach is based on histogram shaping of the outputs of all ANN configurations to resemble the output histogram of a baseline ANN configuration. We tested the effectiveness of the technique using synthetic data generated from two two-dimensional isotropic Gaussian distributions and 100 ANN configurations. The proposed output calibration technique significantly reduced the median standard deviation of the ANN outputs from 0.010 before calibration to 0.006 after calibration. The standard deviation of the sensitivity of the 100 ANN configurations at the same decision threshold reduced significantly from 0.005 before calibration to 0.003 after calibration. Similarly the standard deviation of their specificity values decreased significantly from 0.016 before calibration to 0.006 after calibration.
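One common way to implement such histogram shaping is quantile mapping, sketched below (the paper's exact procedure may differ):

    import numpy as np

    def calibrate(outputs, baseline_outputs):
        # Map each output to the baseline value at the same empirical quantile,
        # so the calibrated outputs inherit the baseline histogram.
        ranks = outputs.argsort().argsort() / (len(outputs) - 1.0)
        return np.quantile(baseline_outputs, ranks)

    rng = np.random.default_rng(6)
    calibrated = calibrate(rng.beta(2, 5, 1000), rng.beta(5, 2, 1000))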
A new study supports and expands upon a previous report that computed radiography (CR) mammography offers as good, or better, image quality than state-of-the-art screen/film mammography. The suitability of CR mammography is explored through qualitative and quantitative study components: feature comparison and cancer detection rates of each modality. Images were collected from 150 normal and 50 biopsy-confirmed subjects representing a range of breast and pathology types. Comparison views were collected without releasing compression, using automatic exposure control on Kodak MIN-R films, followed by CR. Digital images were displayed as both softcopy (S/C) and hardcopy (H/C) for the feature comparison, and as S/C for the cancer detection task. The qualitative assessment used preference scores from five board-certified radiologists obtained while viewing 100 screen/film-CR pairs from the cancer subjects for S/C and H/C CR output. Fifteen general image-quality features were rated, and up to 12 additional features were rated for each pair, based on the pathology present. Results demonstrate that CR is equivalent or preferred to conventional mammography for overall image quality (89% S/C, 95% H/C), image contrast (95% S/C, 98% H/C), sharpness (86% S/C, 93% H/C), and noise (94% S/C, 91% H/C). The quantitative objective was satisfied by asking 10 board-certified radiologists to provide a BI-RADS™ score and probability of malignancy per breast for each modality of the 200 cases. At least 28 days passed between observations of the same case. Average sensitivity and specificity were 0.89 and 0.82 for CR and 0.91 and 0.82 for screen/film, respectively.
Evaluation of imaging hardware represents a vital component of system design. In small-animal SPECT
imaging, this evaluation has become increasingly difficult with the emergence of multi-pinhole apertures
and adaptive, or patient-specific, imaging. This paper will describe two methods for hardware evaluation
using reconstructed images. The first method is a rapid technique incorporating a system-specific non-linear,
three-dimensional point response. This point response is easily computed and offers qualitative insight into
an aperture's resolution and artifact characteristics. The second method is an objective assessment of signal
detection in lumpy backgrounds using the channelized Hotelling observer (CHO) with 3D Laguerre-Gauss and
difference-of-Gaussian channels to calculate area under the receiver-operating characteristic curve (AUC).
Previous work presented at this meeting described a unique, small-animal SPECT system (M3R) capable of
operating under a myriad of hardware configurations and ideally suited for image quality studies. Measured
system matrices were collected for several hardware configurations of M3R. The data used to implement
these two methods was then generated by taking simulated objects through the measured system matrices.
The results of these two methods comprise a combination of qualitative and quantitative analysis that is
well-suited for hardware assessment.
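For reference, rotationally symmetric Laguerre-Gauss channel profiles can be generated as below (2D version shown; the study uses 3D channels, and the width parameter here is arbitrary):

    import numpy as np
    from scipy.special import eval_laguerre

    def lg_channel(order, a, shape=(64, 64)):
        # u_n(r) = (sqrt(2)/a) * exp(-pi r^2 / a^2) * L_n(2 pi r^2 / a^2)
        y, x = np.indices(shape) - (np.array(shape)[:, None, None] - 1) / 2.0
        g = 2.0 * np.pi * (x**2 + y**2) / a**2
        return (np.sqrt(2.0) / a) * np.exp(-g / 2.0) * eval_laguerre(order, g)

    channels = np.stack([lg_channel(n, a=10.0).ravel() for n in range(5)])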
The purpose of this study was to develop a concise way to summarize radiographic contrast detail curves.
We obtained experimental data that measured lesion detection in CT images of a 5-year-old
anthropomorphic phantom. Five lesion diameters (2.5 to 12.5 mm) were investigated, and contrast detail
(CD) curves were generated at each of five tube current-exposure time product (mAs) values using two-alternative forced-choice (2-AFC) studies. A performance index for each CD curve was calculated as the
area under the curve bounded by the maximum and minimum lesion sizes, with this value being normalized
by the range of lesion sizes used. We denote this quantity, which is mathematically equal to the mean
value of the CD curve, as the contrast-detail performance index (PCD). This quantity is inspired by the area
under the curve (Az) that is used as a performance index in ROC studies, though there are important
differences. PCD, like Az, allows for the reduction in the dimensionality of experimental results, simplifying
interpretation of data while discarding details of the respective curve (CD or ROC). Unlike Az, PCD
decreases with increasing performance, and the range of values is not fixed as for Az (i.e. 0 < Az < 1). PCD
is proportional to the average SNR for the lesions used in the 2-AFC experiments, and allows relative
performance comparisons as experimental parameters are changed. For the CT data analyzed, the PCD
values were 0.196, 0.166, 0.146, 0.132, and 0.121 at mAs values of 30, 50, 70, 100, and 140, respectively.
This corresponds to an increase in performance (i.e. decrease in required contrast) relative to the 30 mAs
PCD value of 62%, 48%, 33%, and 18% for the 140, 100, 70, and 50 mAs data, respectively.
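The calculation reduces to a trapezoidal area divided by the size range; the contrast values below are hypothetical, not the study's data.

    import numpy as np

    sizes = np.array([2.5, 5.0, 7.5, 10.0, 12.5])        # lesion diameters (mm)
    contrast = np.array([0.40, 0.22, 0.15, 0.11, 0.09])  # hypothetical thresholds

    # PCD = (area under the CD curve) / (size range) = mean value of the curve.
    area = np.sum(0.5 * (contrast[1:] + contrast[:-1]) * np.diff(sizes))
    pcd = area / (sizes[-1] - sizes[0])
    print(round(pcd, 3))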
We developed image presentation software that mimics the functionality available in the clinic, but that also records time-stamped observer-display interactions and is readily deployable on diverse workstations, making it possible to collect comparable observer data at multiple sites. Commercial image presentation software for clinical use has limited application for research on image perception, ergonomics, computer aids, and informatics because it does not collect observer responses, or other information on observer-display interactions, in real time. It is also very difficult to collect observer data from multiple institutions unless the same commercial software is available at the different sites. Our software not only records observer reports of abnormalities and their locations, but also inspection time until report, inspection time for each computed radiograph and for each slice of tomographic studies, and the window/level and magnification settings used by the observer. The software is a modified version of the open-source ImageJ software available from the National Institutes of Health. Our software involves changes to the base code and extensive new plugin code. Our free software is currently capable of displaying computed tomography and computed radiography images. The software is packaged as Java class files and can be used on Windows, Linux, or Mac systems. By deploying our software together with experiment-specific script files that administer experimental procedures and image file handling, multi-institutional studies can be conducted that increase reader and/or case sample sizes or add experimental conditions.
In this work we focus on the generation of reliable ground truth data for a large medical repository of digital
cervicographic images (cervigrams) collected by the National Cancer Institute (NCI). This work is part of an
ongoing effort conducted by NCI together with the National Library of Medicine (NLM) at the National Institutes
of Health (NIH) to develop a web-based database of the digitized cervix images in order to study the evolution
of lesions related to cervical cancer. As part of this effort, NCI has gathered twenty experts to manually segment
a set of 933 cervigrams into regions of medical and anatomical interest. This process yields a set of images
with multi-expert segmentations. The objectives of the current work are: 1) generate multi-expert ground truth and assess the difficulty of segmenting an image, 2) analyze observer variability in the multi-expert data, and
3) utilize the multi-expert ground truth to evaluate automatic segmentation algorithms. The work is based on
STAPLE (Simultaneous Truth and Performance Level Estimation), which is a well known method to generate
ground truth segmentation maps from multiple experts' observations. We have analyzed both intra- and inter-expert
variability within the segmentation data. We propose novel measures of "segmentation complexity" by
which we can automatically identify cervigrams that were found difficult to segment by the experts, based on
their inter-observer variability. Finally, the results are used to assess our own automated algorithm for cervix
boundary detection.
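A minimal sketch of binary STAPLE illustrates the underlying EM iteration (this simplified version holds the foreground prior fixed; the full algorithm is more elaborate):

    import numpy as np

    def staple(D, iters=50):
        # D: binary expert decisions, shape (raters, pixels).
        R, N = D.shape
        W = D.mean(axis=0)                        # initial truth probabilities
        p = np.full(R, 0.9); q = np.full(R, 0.9)  # sensitivities, specificities
        prior = W.mean()
        for _ in range(iters):
            # E-step: posterior probability that each pixel is truly foreground.
            a = prior * np.prod(np.where(D == 1, p[:, None], 1 - p[:, None]), 0)
            b = (1 - prior) * np.prod(np.where(D == 0, q[:, None], 1 - q[:, None]), 0)
            W = a / (a + b)
            # M-step: re-estimate each rater's performance parameters.
            p = (D * W).sum(axis=1) / W.sum()
            q = ((1 - D) * (1 - W)).sum(axis=1) / (1 - W).sum()
        return W, p, q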
PURPOSE: Detection of coronary artery calcifications (CAC) using conventional chest radiographs has a high positive
predictive value but low sensitivity for coronary artery disease. We investigated the role of dual energy imaging to
enhance reader performance in the detection of CAC, indicative of atherosclerotic plaques.
METHODS: A sample of 53 patients with CT documented CAC and 23 patients without CT evidence of CAC, was
imaged using a dual energy protocol on an amorphous silicon flat panel system (Revolution XR/d, GE Medical
Systems). The acquisition sequence consisted of a 60 kVp ("low energy") exposure, followed by a 120 kVp ("high energy") exposure with a time separation of 150 ms. Subsequent image processing yielded conventional PA and lateral
radiographs and a subtracted PA "bone image". For all patients and both data sets, CAC were evaluated by two
experienced board-certified thoracic radiologists via Likert scale measurement (1-5 score).
RESULTS: Sensitivity for CAC detection, using conventional radiographs, was 34.0% and 56.6% while specificity was
96.6% and 91.3%, for the two readers respectively. Using the "bone images", sensitivity was 92.4% and 83.0% while
specificity was 100% and 91.3%. For patients with verified CAC, "bone images" resulted in at least a one Likert score
increase in 73.6% and 54.7% of cases for the two readers.
CONCLUSION: We conclude that using dual energy technology, "bone images" may allow higher sensitivity in
detecting CAC compared with conventional radiographs, without decreased specificity. Thus, we believe our findings
are useful in defining a role for dual energy subtraction radiography in improved detection of coronary artery disease.
Increasing transmission of medical images across multiple user systems raises concerns for image security. Hiding watermark information in medical image data files is one solution for enhancing security and privacy protection of data. Medical image watermarking however is not a widely studied area, due partially to speculations on loss in viewer performance caused by degradation of image information. Such concerns are addressed if the amount of information lost due to watermarking can be kept at minimal levels and below visual perception thresholds. This paper describes experiments where three alternative visual quality metrics were used to assess the degradation caused by watermarking medical images. Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) medical images were watermarked using different methods: Block based Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) with various embedding strengths. The visual degradation of each watermarking parameter setting was assessed using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Measure (SSIM) and Steerable Visual Difference Predictor (SVDP) numerical metrics. The suitability of each of the three numerical metrics for medical image watermarking visual quality assessment is noted. In addition, subjective test results from human observers are used to suggest visual degradation thresholds.
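Of the three metrics, PSNR is the simplest; for 8-bit images it can be computed as below (SSIM and SVDP require full reference implementations).

    import numpy as np

    def psnr(original, marked, peak=255.0):
        # Peak signal-to-noise ratio (dB) between original and watermarked images.
        mse = np.mean((original.astype(float) - marked.astype(float)) ** 2)
        return float('inf') if mse == 0 else 10.0 * np.log10(peak**2 / mse)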
Five radiologists detected suspicious mass regions depicted on mammograms acquired from 32 examinations during this pilot study. Among these, 24 examinations depicted subtle masses (12 malignant and 12 benign) and 8 were negative. Each observer interpreted a case in sequential order under three reading modes. In mode one, the observer interpreted images without viewing CAD-generated cues and provided two likelihood scores (for detection and classification) for each identified suspicious region. In mode two, CAD-cued results were provided and the observer could decide whether to make any changes to the previous ratings. In mode three, each observer was forced to query at least one suspected region. Once a region was queried, the CAD scheme automatically segmented the mass region and computed a set of image features. Using a conditioned k-nearest neighbor (KNN) algorithm, six reference regions that were considered "the most similar" to the queried region were selected and displayed along with CAD-generated scores. Again, the observer had the option to change previous ratings. Experimental results were analyzed using the ROC method. The five observers marked a total of 271, 276, and 281 mass regions under the three reading modes, respectively. In mode two, observers marked 5 new suspected mass regions and did not make any changes to previously rated detection or classification scores. In mode three, although observers queried 18 additional regions, 13 were discarded and 5 were marked with region-specific scores. The observers also changed the previous rating scores of 28 mass regions marked during mode one. The areas under the ROC curves for individual readers ranged from 0.51 to 0.71 for mass detection (p = 0.67) and from 0.50 to 0.73 for mass classification (p = 0.43). This pilot study suggested that using ICAD could increase radiologists' confidence in their decision making. We also found that, because radiologists tend to accept a higher false-positive rate in a laboratory environment, once they have made their detection decision during the initial reading they are frequently reluctant to make changes during the following modes. Hence, while simple and efficient operationally, the sequential reading mode may not be an optimal approach for evaluating the actual utility of ICAD.
We are comparing the performance of computer-aided detection (CAD) used as a second reader to concurrent-use CAD.
We have designed a multi-reader multi-case (MRMC) observer study using fixed-size mammographic background
images with fixed intensity Gaussian signals added in two experiments. A CAD system was developed to automatically
detect these signals. The two experiments utilized signals of different contrast levels to assess the impact of CAD when
the standalone CAD sensitivity was superior (low contrast) or equivalent (high contrast) to the average reader in the
study. Seven readers participated in the study and were asked to review 100 images, identify signal locations, and rate
each on a 100-point scale. A rating of 50 was used as a cutpoint and provided a binary classification of each candidate.
Readers read the case set using CAD in both the second-reader and concurrent-reader scenarios. Results from the different signal intensities and reading paradigms were analyzed using the area under the free-response receiver operating characteristic (FROC) curve. Sensitivity and the average number of FPs/image were also determined. The results
showed that CAD, either used as a second reader or as a concurrent reader, can increase reader sensitivity but with an
increase in FPs. The study demonstrated that readers may benefit from concurrent CAD when CAD standalone
performance outperforms average reader sensitivity. However, this trend was not observed when CAD performance
was equivalent to the sensitivity of the average reader.
The contrast-to-noise ratio (CNR) is often used as a physical evaluation parameter for low-contrast resolution in computed tomography (CT). However, CNR is not affected by the window conditions. This study proposes a new physical evaluation method for low-contrast resolution that takes into account changes in window conditions. This new parameter, called the gray-scale contrast-to-noise ratio (GSCNR), was assessed and compared with CNR.
For each reconstruction image, the window width (WW) was varied from 100 to 400 in steps of 100 while keeping the window level (WL) fixed, and CNR and GSCNR were calculated. WL was then varied from 0 to 100 in steps of 20 while keeping WW fixed, and CNR and GSCNR were calculated again.
CNR did not vary with WW, but it varied inversely with the standard deviation (SD) of the CT number (from 2.2 for an SD of 7 to 1.4 for an SD of 16). In contrast, GSCNR decreased with the increase in WW for each SD. In addition, GSCNR did not vary with WL, but it varied inversely with SD.
GSCNR was found to be a useful physical evaluation parameter and was also thought to be useful for optimizing the window conditions.
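The abstract does not reproduce the exact GSCNR formula, so the sketch below is only one plausible formalization, chosen to be consistent with the reported behavior (decreases as WW widens, insensitive to WL in the absence of clipping, inversely related to SD): contrast is measured on the displayed gray scale while the noise term is kept in CT numbers.

    import numpy as np

    def window(hu, wl, ww, levels=256):
        # Linear window/level mapping from CT numbers to displayed gray levels.
        return np.clip((hu - wl + ww / 2.0) / ww * (levels - 1), 0, levels - 1)

    def gscnr(mu_obj, mu_bg, sd, wl, ww):
        # Hypothetical formalization: gray-level contrast over CT-number noise.
        return abs(window(mu_obj, wl, ww) - window(mu_bg, wl, ww)) / sd

    print(gscnr(20.0, 0.0, 7.0, wl=10, ww=100))   # narrower WW: larger GSCNR
    print(gscnr(20.0, 0.0, 7.0, wl=10, ww=400))   # wider WW: smaller GSCNR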
Technological developments of computed tomography (CT) have led to an increase in its clinical utilization. To optimize patient dose and image quality, scanner manufacturers have introduced X-ray tube current modulation coupled to Automatic Exposure Control (AEC) devices. The purpose of this work was to assess the performance of the CT-AEC of three different MSCT manufacturers by means of two phantoms: a conical PMMA phantom to vary the thickness of the absorber monotonically, and an anthropomorphic chest phantom to assess the response of the CT-AEC under more realistic conditions. Noise measurements were made by standard deviation assessments, and dose indicators (CTDIvol and DLP) were calculated. All scanners were able to compensate for thickness variation by adapting the tube current. Initial current adaptation lengths varied across systems in the range of 1 to 5 cm. With the anthropomorphic phantom, noticeable differences appeared in the rapidity of adaptation to a sudden X-ray attenuation change, and non-intuitive behavior of the current evolution was noticed for some acquisitions. The xyz-modulation reduced the DLP of the acquisition by 18% compared to the z-modulation. It was also shown that a homogeneous test object is not sufficient to characterize CT-AEC devices.
Accurate longitudinal measurements are essential in understanding treatment effectiveness. Miscalibrated gradients, low acquisition bandwidth, or abnormally high B0 inhomogeneities may cause geometric distortions in MRI, which in turn may affect the imaging-based biomarkers used to understand disease progression. This work presents the behavior of several MRI sites over an average period of 12 months using the analysis of a volume and linearity phantom with known geometry. The phantom was scanned in the axial, coronal, and sagittal planes using a T2 FSE sequence. For each month's scan, the average phantom length was measured in the right/left, anterior/posterior, and superior/inferior directions. The distortion variation was measured in each gradient axis and orientation over time. Results show that some magnets exhibit a significant drift within the scanning period. Unless this type of distortion is considered, the treatment efficacy outcome may be annulled due to misleading and erroneous conclusions.
An observer study was conducted on a randomly selected sampling of 152 digital projection radiographs of varying body parts obtained from four medical institutions for the purpose of assessing a new workflow-efficient image-processing framework. Five rendering treatments were compared to measure the performance of a new processing algorithm against the control condition. A key feature of the new image processing is the capability of processing without specifying the exam. Randomized image pairs were presented at a softcopy workstation equipped with two diagnostic-quality flat-panel monitors. Five board-certified radiologists and one radiology resident independently reviewed each image pair, blinded to the specific processing used, and provided a diagnostic-quality rating using a subjective rank-order scale for each image. In addition, a relative preference rating was used to indicate rendering preference. Aggregate results indicate that the new fully automated processing is preferred relative to the control (sign test for median = 0, α = 0.05: p < 0.0001).
At modest compression ratios, lossy compression schemes allow substantial image size reduction without a significant loss in visual information. This is a consequence of the coding engines' transformations (such as the discrete cosine transform (DCT) and the discrete wavelet transform (DWT)) in combination with quantization and truncation operations, which all exploit the characteristics of the human visual system to achieve file-size reduction. The objective of our study was to determine levels of lossy compression that can be confidently used in diagnostic imaging. We conducted an extensive clinical evaluation using a standardized methodology incorporating two recognized evaluation techniques: Diagnostic Accuracy with Receiver Operating Characteristic (ROC) Analysis, and Original-Revealed Forced Choice. Images covering 5 modalities and 7 anatomical regions were compressed at 3 different levels using JPEG and JPEG 2000 compression algorithms.
To enable radiologists across Canada to evaluate images for our study, we developed a dedicated software application that was synchronized to a centralized server, which allowed results to be reported in real time to the central database via the Internet.
In order to obtain findings that were relevant to everyday clinical evaluation, images were not viewed under a strict laboratory environment; rather, they were read under typical viewing conditions that comply with current standards of practice.
We present here the methodology and specific technology developed for the purpose of this study, we explain the specific problems that we encountered during the implementation, and we give preliminary results.
Our preliminary findings suggest that the most appropriate compression algorithm and compression ratios are largely dependent on image specifics, including the type/modality and the anatomical region studied.