Adaptive thresholding is a useful technique for document analysis. In medical image processing, it is also helpful for segmenting structures such as diaphragms or blood vessels. The technique computes a threshold for each pixel from local information in the pixel's neighborhood, then binarizes the pixel against that threshold. Although it is robust to changes in illumination, computing the thresholds takes a significant amount of time because every pixel requires summing all of its neighboring pixels. Integral images can alleviate this overhead; however, medical images, such as ultrasound images, often come with image masks, and ordinary integral-image algorithms often produce artifacts on them. The main problem is that the summing area is no longer rectangular near the boundaries of the image mask. For example, the threshold at the boundary of the mask is incorrect because masked-out pixels are also counted. Our key idea for coping with this problem is to compute an integral image of the image mask as well, which gives the number of valid pixels in each summing area. Our method is implemented on a GPU using CUDA, and experimental results show that our algorithm is 164 times faster than a naïve CPU algorithm for averaging.
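
The key idea can be illustrated with a minimal CPU sketch in plain Python. It is not the CUDA implementation described above. The function names (`integral_image`, `box_sum`, `masked_adaptive_threshold`), the square window of half-width `radius`, and the decision rule (a pixel is foreground if it exceeds the local mean minus `offset`) are assumptions made for illustration only. One integral image accumulates the intensities of valid pixels, and a second accumulates the mask itself, so the local mean is divided by the number of valid pixels rather than by the full window area.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 using four table lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def masked_adaptive_threshold(image, mask, radius, offset=0):
    """Binarize `image` against a local mean taken over valid pixels only.

    `mask` holds 1 for valid pixels and 0 for masked-out pixels.
    The rule "pixel > local mean - offset" is an assumed example rule.
    """
    h, w = len(image), len(image[0])
    # Zero the masked-out pixels so they add nothing to the intensity sums.
    masked = [[image[y][x] if mask[y][x] else 0 for x in range(w)]
              for y in range(h)]
    sat_val = integral_image(masked)  # sums of valid intensities
    sat_cnt = integral_image(mask)    # counts of valid pixels
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            if not mask[y][x]:
                continue  # masked-out pixels stay 0 in the output
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = box_sum(sat_val, y0, x0, y1, x1)
            count = box_sum(sat_cnt, y0, x0, y1, x1)  # >= 1: the pixel itself is valid
            # Compare pixel > total / count - offset without dividing.
            if image[y][x] * count > total - offset * count:
                out[y][x] = 1
    return out


image = [[10, 10, 200, 200],
         [10, 50, 200, 200],
         [10, 10, 200, 200]]
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 0, 0]]
all_valid = [[1] * 4 for _ in range(3)]

with_mask = masked_adaptive_threshold(image, mask, radius=1)
without_mask = masked_adaptive_threshold(image, all_valid, radius=1)
```

In this toy input the pixel with value 50 sits next to the mask boundary. When the mask is ignored, the bright masked-out column inflates the local mean and the pixel is classified as background; when the valid-pixel count comes from the mask's integral image, the same pixel is classified as foreground.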