Robust still-image watermarks are evaluated in terms of image fidelity and robustness. We extend this framework by applying reliability testing to the evaluation of robust still-image watermarks. Reliability is the probability that a watermarking algorithm will correctly detect or decode a watermark for a specified fidelity requirement under a given set of attacks and images. In reliability testing, a system is evaluated in terms of quality, load, capacity, and performance. To measure quality, which corresponds to image fidelity, we compensate for attacks before measuring the fidelity of attacked watermarked images. We use a conditional mean of pixel values to compensate for valumetric attacks such as gamma correction and histogram equalization. To compensate for geometric attacks, we use error concealment and assume perfect motion estimation. We define capacity as the minimum embedding strength parameter and the maximum data payload that meet a specified error criterion. Load is then defined as the actual embedding strength and data payload of a watermark. To measure performance, we use the bit error rate (BER) and the receiver operating characteristic (ROC) of a watermarking algorithm for different attacks and images. We evaluate robust watermarks for various loads, attacks, and images.
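
As a rough illustration of two of the measurements named above, the following Python sketch shows one possible reading of conditional-mean compensation for a valumetric attack, together with a bit error rate computation. It is an assumption-laden sketch, not the method as implemented in this work: the function names (`conditional_mean_compensate`, `psnr`, `bit_error_rate`, `gamma_correct`), the use of PSNR as the fidelity measure, the flattened 8-bit pixel lists, and the toy data are all hypothetical choices made for illustration.

```python
from collections import defaultdict
import math


def conditional_mean_compensate(reference, attacked):
    """Replace each attacked gray level by the mean of the reference pixels
    that were mapped to that level, i.e. an estimate of E[reference | attacked].
    Assumption: both images are flattened lists of equal length."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ref_px, att_px in zip(reference, attacked):
        sums[att_px] += ref_px
        counts[att_px] += 1
    return [sums[a] / counts[a] for a in attacked]


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumed fidelity measure)."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def bit_error_rate(embedded_bits, decoded_bits):
    """Fraction of payload bits that were decoded incorrectly."""
    errors = sum(e != d for e, d in zip(embedded_bits, decoded_bits))
    return errors / len(embedded_bits)


def gamma_correct(pixels, gamma):
    """Toy valumetric attack: gamma correction on 8-bit gray levels."""
    return [round(255.0 * (p / 255.0) ** gamma) for p in pixels]


# Toy data standing in for a watermarked image and its payload.
reference = [10, 50, 50, 120, 200, 200, 255, 30]
attacked = gamma_correct(reference, 0.5)
compensated = conditional_mean_compensate(reference, attacked)

print("PSNR before compensation:", psnr(reference, attacked))
print("PSNR after compensation: ", psnr(reference, compensated))
print("BER:", bit_error_rate([1, 0, 1, 1, 0, 0, 1, 0],
                             [1, 0, 0, 1, 0, 1, 1, 0]))
```

In this toy case the gamma mapping sends distinct gray levels to distinct values, so the conditional mean undoes it exactly. On real images several reference levels generally collapse onto one attacked level, and the compensated image then only approximates the reference, which is the residual distortion a fidelity measure would capture.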