Observers participating in ROC studies are usually required to estimate the confidence with which each
observation is made. With a discrete scale, the rating, or score, normally falls into one of five categories, ranging from
'definitely normal' to 'definitely abnormal'. However, a major problem in the analysis of data from ROC studies
arises when observers do not use the rating scale uniformly, and instead place most of their responses in the
two extreme categories, with few responses falling in the middle. The use of a
continuous rating scale, with a point selected using a mouse, has assisted in analysis, but only to a limited extent.
It has therefore been suggested elsewhere that observers should be forced to select intermediate points. The
effect of such an approach on ROC curves was studied by asking a group of observers to re-score a set of difficult
clinical images, after training and with continuous feedback on their compliance. Although the resulting fall in the
ROC curves was not statistically significant, it is considered unwise to force observers to report in a manner
that appears unnatural to them.
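
The analysis problem caused by non-uniform use of the scale can be illustrated with a minimal sketch. It assumes a five-category scale (1 = 'definitely normal', 5 = 'definitely abnormal') and derives empirical ROC operating points by thresholding the ratings. The function name `roc_points` and all ratings shown are hypothetical illustrations, not data or methods from the study.

```python
from typing import List, Tuple


def roc_points(normal: List[int], abnormal: List[int],
               n_categories: int = 5) -> List[Tuple[float, float]]:
    """Return one (false-positive fraction, true-positive fraction) pair
    per rating threshold. A case is called abnormal when its rating is
    greater than or equal to the threshold."""
    points = []
    for threshold in range(n_categories, 1, -1):  # thresholds 5, 4, 3, 2
        fpf = sum(r >= threshold for r in normal) / len(normal)
        tpf = sum(r >= threshold for r in abnormal) / len(abnormal)
        points.append((fpf, tpf))
    return points


# Hypothetical observer who uses only the two extreme categories.
extreme_normal = [1, 1, 1, 1, 1, 1, 1, 5, 1, 1]
extreme_abnormal = [5, 5, 5, 5, 5, 5, 5, 1, 1, 5]

# Hypothetical observer who spreads responses across the whole scale.
spread_normal = [1, 1, 2, 2, 2, 3, 3, 3, 4, 5]
spread_abnormal = [1, 2, 3, 3, 4, 4, 4, 5, 5, 5]

# Count the distinct operating points available for fitting a curve.
extreme_distinct = len(set(roc_points(extreme_normal, extreme_abnormal)))
spread_distinct = len(set(roc_points(spread_normal, spread_abnormal)))
print(extreme_distinct, spread_distinct)
```

In this sketch the observer who uses only the extremes yields a single distinct operating point, whereas the observer who uses the whole scale yields four, which gives a curve-fitting procedure far more to work with.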