Most prior 2AFC experiments have used a small number of signal strengths with many scenes at each strength. Percent correct is then computed for each level and fit to the assumed psychometric function. However, this introduces error because the signal strengths of individual responses are shifted. An alternative approach is to compute the statistical likelihood as a function of the threshold and width of the psychometric response curve. The best fit is then the threshold and width that maximize the likelihood. In this paper, we discuss a method for analyzing 2AFC observer responses using maximum likelihood estimation (MLE) techniques. The logit model is used to represent the psychometric function and to derive the likelihood. A conjugate gradient search algorithm is then used to find the maximum of the likelihood. The method is illustrated using human observer results from a previous study, and its statistical characteristics are examined using simulated response data. The human observer results show that the psychometric function varies between observers and from test to test. The simulations show that the variances of the threshold and width follow a 1/Nobs relationship (σ = 1.5201 × Nobs^(-0.5236)), where Nobs is the number of observations made in a 2AFC test, ranging from 10 to 30,000. The variance of the human observer data was in close agreement with the simulations. These results indicate that the method is robust over a wide range of observation counts and can be used to predict human responses. The results of the simulations also suggest how to minimize error in future studies.
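The likelihood approach described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact logit parameterization (chance level 0.5, probability 0.75 at the threshold), the function names, and the simulated signal range are all assumptions, and a plain grid search stands in for the conjugate gradient search named in the abstract.

```python
import math
import random


def psychometric(x, threshold, width):
    """Probability of a correct 2AFC response at signal strength x.

    Assumed logit form: rises from chance (0.5) to 1.0 and equals 0.75
    at x == threshold; width sets how quickly it rises.
    """
    z = max(-700.0, min(700.0, -(x - threshold) / width))  # avoid overflow in exp
    return 0.5 + 0.5 / (1.0 + math.exp(z))


def log_likelihood(trials, threshold, width):
    """Log-likelihood of individual responses, with no binning by signal level.

    trials: list of (signal_strength, correct) pairs, correct being a bool.
    """
    eps = 1e-12  # keep log() finite when p is numerically 0 or 1
    total = 0.0
    for x, correct in trials:
        p = min(max(psychometric(x, threshold, width), eps), 1.0 - eps)
        total += math.log(p if correct else 1.0 - p)
    return total


def fit_grid(trials, thresholds, widths):
    """Return the (threshold, width) grid point with the highest likelihood.

    Grid search is a stand-in here for a conjugate gradient search.
    """
    candidates = ((t, w) for t in thresholds for w in widths)
    return max(candidates, key=lambda tw: log_likelihood(trials, tw[0], tw[1]))


def simulate_2afc(n_obs, threshold, width, seed=0):
    """Simulate n_obs responses at signal strengths drawn uniformly from [0, 2]."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_obs):
        x = rng.uniform(0.0, 2.0)
        trials.append((x, rng.random() < psychometric(x, threshold, width)))
    return trials


if __name__ == "__main__":
    trials = simulate_2afc(2000, threshold=1.0, width=0.25)
    thresholds = [0.5 + 0.05 * i for i in range(21)]
    widths = [0.05 + 0.05 * i for i in range(10)]
    print(fit_grid(trials, thresholds, widths))
```

Because each response enters the likelihood with its own signal strength, no response has to be assigned to a nominal level before fitting. Repeating the simulation for different values of n_obs is one way to examine how the spread of the estimates changes with the number of observations.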
The DICOM Grayscale Standard Display Function (GSDF) relates display contrast to the contrast threshold derived from the Barten Model (CBM) of the human visual system. We have measured the contrast threshold (CT) using a monochrome medical LCD monitor and graphics card under the conditions defined by the DICOM standard and compared the results to the Barten Model. A two-alternative forced choice (2AFC) observer performance test was used to measure the contrast threshold. The 2AFC tests were given once to a large group of observers with varied medical imaging experience. A small subset of this group was tested multiple times over several months in order to examine intraobserver variability. The mean relative contrast (CT/CBM) associated with a 75% detection rate was found to be 0.508, with a standard deviation of 0.176. For the intraobserver tests, results improved after the first 3 trials. The mean CT/CBM values (and standard deviations) for the next 9 tests were 0.0980 (0.107), 0.244 (0.0928), and 0.398 (0.0855). The results indicate that contrast substantially less than 1 CT/CBM is detected based on the statistical criteria used. This can be explained by the detection criteria used in the classical observer tests that form the basis for the Barten model. Additionally, our data indicate significant differences among observers.
Grayscale medical monitors are commonly calibrated by transforming the image display values sent to a graphics controller using a lookup table (LUT). The calibration LUT is deduced from the uncalibrated luminance response (uLR) of the display system. The uLR of liquid crystal display (LCD) systems is poorly behaved, with significant discontinuities in the relative luminance changes. Accurate grayscale calibration of LCD devices thus requires a measurement of the luminance for the full palette of possible output values. A method is reported to acquire the uLR of LCD displays, generate a LUT to achieve precise calibration, and assess the accuracy of the calibration results. A palette of 766 luminance values can be measured in 12 minutes. The accuracy of the method permits relative luminance changes, dL/L, to be evaluated with a precision of 0.0002 to 0.0007 for luminance values between 1000 and 1 cd/m². For seven LCD monitors, 766 values of the uLR were measured and calibration tables deduced. The calibrated luminance response (cLR) for 256 gray values was then compared to the DICOM standard. The root mean squared error of the observed JNDs per luminance interval ranged from 0.37 to 0.59, which is less than the AAPM recommended value of 1.0. A full calibration of this type should be done at installation. However, the stability of LCD systems suggests that periodic recalibration will not be necessary.
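The step of deducing a LUT from a measured uLR can be illustrated with a short sketch. It is a hypothetical simplification rather than the reported method: the function names are invented for illustration, the uLR is assumed to be strictly increasing, the synthetic 766-entry response and the log-spaced target curve are placeholders (the target is not the DICOM GSDF), and dL/L is taken here as the luminance difference divided by the mean luminance of adjacent entries.

```python
from bisect import bisect_left


def log_spaced(n, l_min, l_max):
    """n luminance values spaced evenly in log(L); a placeholder curve only."""
    return [l_min * (l_max / l_min) ** (i / (n - 1)) for i in range(n)]


def build_lut(ulr, targets):
    """For each target luminance, pick the palette index with the closest luminance.

    ulr: measured luminance per palette index, assumed strictly increasing.
    targets: desired luminance per gray level.
    """
    lut = []
    for target in targets:
        i = bisect_left(ulr, target)
        if i == 0:
            lut.append(0)  # target at or below the darkest entry
        elif i == len(ulr):
            lut.append(len(ulr) - 1)  # target above the brightest entry
        else:
            # choose whichever neighbour is nearer in luminance
            lut.append(i if ulr[i] - target < target - ulr[i - 1] else i - 1)
    return lut


def relative_steps(luminance):
    """dL/L between adjacent entries, using the mean of each pair as L (assumed)."""
    return [2.0 * (b - a) / (a + b) for a, b in zip(luminance, luminance[1:])]


if __name__ == "__main__":
    ulr = log_spaced(766, 1.0, 1000.0)      # synthetic stand-in for a measured uLR
    targets = log_spaced(256, 1.0, 1000.0)  # placeholder target, not the GSDF
    lut = build_lut(ulr, targets)
    calibrated = [ulr[i] for i in lut]      # luminance actually produced per gray level
    print(lut[:4], max(relative_steps(calibrated)))
```

With a real display, ulr would hold the measured palette luminances and targets would come from the display standard being matched; comparing the relative steps of the calibrated response against those of the target is one way to assess the result.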