<p>Several autofocus algorithms based on the analysis of image sharpness have been proposed for microscopy applications. Since autofocus functions (AFs) are computed from several images captured at different lens positions, these algorithms are considered computationally intensive. With the aim of presenting the capabilities of dedicated hardware to speed-up the autofocus process, we discuss the implementation of four AFs using, respectively, a multicore central processing unit (CPU) architecture and a graphic processing unit (GPU) card. Throughout different experiments performed on 300 image stacks previously identified with tuberculosis bacilli, the proposed implementations have allowed for the acceleration of the computation time for some AFs up to 23 times with respect to the serial version. These results show that the optimal use of multicore CPU and GPUs can be used effectively for autofocus in real-time microscopy applications.</p>
Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm of a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that consists of the minimization of a cost function; minimization achieved by means of a serial Gauss–Seidel kind algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, our algorithm is a parallel Jacobi class with alternated minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel at same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architecture such as CPU-multicore, Xeon Phi coprocessor, and Nvidia graphics processing unit. In all the cases, we obtain a superior performance of our parallel algorithm when compared with the original serial version. In addition, we present a detailed comparative performance of the developed parallel versions.