17 July 2012 Anisotropy-based robust focus measure for non-mydriatic retinal imaging
Author Affiliations +
Non-mydriatic retinal imaging is an important tool for diagnosis and progression assessment of ophthalmic diseases. Because it does not require pharmacological dilation of the patient's pupil, it is essential for screening programs performed by non-medical personnel. A typical camera is equipped with a manual focusing mechanism to compensate for the refractive errors in the eye. However, manual focusing is error prone, especially when performed by inexperienced photographers. In this work, we propose a new and robust focus measure based on a calculation of image anisotropy which, in turn, is evaluated from the directional variance of the normalized discrete cosine transform. Simulation and experimental results demonstrate the effectiveness of the proposed focus measure.



Ocular fundus imaging has long played a key role in the documentation, diagnosis, and progression assessment of ophthalmic diseases. With the advent of digital cameras, ophthalmic imaging changed dramatically. Among the advantages of digital imaging are the ease and speed of access to data, fast and exact duplication, archiving and transmission, and digital image analysis techniques. Altogether, these advantages set the foundations for modern ophthalmology in the framework of telemedicine.

Fundus cameras can be mydriatic or non-mydriatic. Mydriatic fundus cameras require pharmacological dilation, while non-mydriatic cameras use a near-infra-red (NIR) viewing system to exploit the patient’s natural dilation in a dark room.1 Infra-red light is used to preview the retina on a video monitor. Once the monitor’s image is focused and aligned, a flash of visible light from a Xenon arc lamp is fired and the image is captured. Non-mydriatic fundus cameras are equipped with a focusing mechanism that displaces a compensation lens. It is basically an aspheric objective lens design that, when combined with the optics of the eye, matches the image plane to the eye fundus. The focus control of the fundus camera is used to compensate for refractive errors in the subject’s eye. Until recently,2 these cameras were entirely operated manually with the focusing mechanism assisted by a split-line visual aid. Manual focusing is error prone, especially when performed by inexperienced photographers, and may lead to images that require additional restoration or enhancement.3 The auto-focus feature offered in new retinal cameras is a significant advance that ultimately leads to a more robust imaging system, especially for medical screening purposes. On the other hand, the auto-focus feature still relies on the split-line mechanism, whereas in this work we propose a passive focus measure (FM) completely based on image analysis. For further details on fundus imaging, the reader is referred to Refs. 1, 4, and 5.

In Ref. 6 we studied the performance of several state-of-the-art no-reference image quality metrics for eye fundus imaging. The most interesting finding relates to the importance of directional properties with image quality. In other words, the measure of anisotropy as a quality metric. This was proposed by two co-authors of this paper (Gabarda and Cristóbal) in Ref. 7 and it represents an important step forward in the area of no-reference quality metrics. However, given the properties of the NIR fundus focusing system, the FM should be robust to noise (spatial and temporal) and to changes in illumination and contrast. Furthermore, real-time imaging is constrained by the overhead required to compute the directional Rényi entropy as described in Ref. 7. Therefore, in this work we made use of our findings on the directional dependency of eye fundus images against defocus to define a new and robust FM based on the directional variance of the normalized discrete cosine transform (DCT). The FM proposed here could impact the design of portable retinal cameras with autofocus function or the manufacturing of low-cost retinal cameras because they would require fewer optical components.



In a single-lens optical imaging system operating within the paraxial regime, the process of focusing consists in adjusting the relative position of the object, the lens, the image sensor, or a certain combination of the three to obtain a focused image. Let f(x,y) be the focused image of a planar object and gi(x,y) a sequence of images recorded for a sequence of camera parameter settings. The eye fundus is actually a curved surface; however, in our case f(x,y) corresponds to a small region of the fundus so it can be considered as an isoplanatic patch.8 We consider the variation of only one camera parameter at a time—either the lens position or the focal length. The acquired set of images can be expressed by convolution


where hi(x,y) is the point spread function (PSF) of the blur in the ith observation. In a practical imaging system, the image magnification and mean image brightness may change while focusing even if nothing has changed in the scene. Normalization with respect to these two parameters can be carried out. However, illumination normalization is more easily performed. Image magnification may be neglected because in most practical applications the magnification is less than 3%.9 Ideally, the best possible case occurs when hi(x,y)=δ(x,y), therefore gi(x,y)=f(x,y). In practice, all hi(x,y) have an unknown low-pass filter effect.

An FM may be understood as a functional defined on the image space which reflects the amount of blurring introduced by hi(x,y). Let S be the FM with which we look for the “best” image by maximizing/minimizing S(gi) over i=1,,m. A reasonable FM should be monotonic with respect to blur and robust to noise. Groen et al.10 used eight different criteria for the evaluation of focus functions. Ideally, the focus function should be unimodal, but in practice it can present various local maxima that can affect the convergence of the auto-focus procedure. Moreover, the focus curve should ideally be sharp at the top and long tailed, which can accelerate the convergence of the screening procedure.


Related Works

Various FMs have been reported in the literature.2,9,1112.13 They mainly consist of a focus-measuring operator that estimates the sharpness of the image. The image that yields a maximum FM is considered as the focused one. Almost all FMs depend directly on the amount of high-frequency information in the image. The high-frequency components correspond to edge information. On the other hand, their accuracy can deviate depending on the content of the processed images. Because well-focused images have sharper edges, they are expected to have higher-frequency content than blurred ones.13 The common FMs are based on norm of gradient or second derivative of the image, gray level variance, and Laplacian energy. Surprisingly, little is known about the performance of these methods for fundus imaging, and the literature on this subject is scarce.

To the best of our knowledge, there exist only two published works that deal with autofocusing in retinal imaging;2,14 however, they use conventional mydriatic imaging in the visible spectrum, which is not our case. In Ref. 14 they do not propose a FM; instead they use several preprocessing operations to improve the performance of traditional FMs for segmentation purposes. On the other hand, in the recent work by Moscaritolo et al.,2 they propose a filtering technique to assess sharpness of optic nerve head images; however, they do not compare with other methods. In this section we briefly summarize five notable approaches—including that of Moscaritolo et al.2—for later comparison with our proposed method.

The first FM S1 was proposed in Ref. 2. It may be defined mathematically as


where zlp is a low-pass filtering of gi(x,y), zmed is a nonlinear median filter of the absolute value |.| of the difference for removing noise, and Var(.) is the variance. Another important measure is the l2-norm of image gradient, also called the energy of gradient, defined as


The third measure is the Laplacian energy. It can analyze high frequencies associated with image edges and is calculated as


Nayar and Nakagawa15 proposed a noise-insensitive FM based on the summed modified Laplacian operators. When two second partial derivatives with respect to horizontal and vertical directions have different signs, one offsets the other and the evaluated focus value is incorrect. The method is a modification to obtain the absolute value of each second partial derivative as


The frequency-selective weighted median (FSWM) filter16 is a high-pass nonlinear filter based on the difference of medians. It is well known as a nonlinear edge detector that removes impulsive noise effectively. The FSWM uses several nonlinear subfilters having a weight according to the frequency acting like a bandpass filter as


where zF(x) is the FSWM filter, P is the number of subfilters, βjR, and z^j(x) is the weighted median filter. The FM is produced by summing FSWM results, Fx and Fy, applied to an image along the horizontal and vertical directions as



Subbarao and Tyan17 analyzed the robustness of three FMs: the image variance (not included here), S2, and S3. They recommended using S3 because of its tolerance to additive noise; however, the differences among individual measures were not significant. There are many other FMs, such as the wavelet-based FM proposed in Ref. 12 or the mid-frequency discrete cosine FM in Ref. 11, but they were not included in our study either because of their lack of robustness to noise or their complex implementation. For a review and evaluation of FMs in natural images, the reader is referred to Refs. 9 and 13.


Discrete Cosine Transform

DCT is an invertible, linear transformation T:RNRN. An image is transformed to its spectral representations by projection onto a set of of orthogonal two-dimensional (2-D) basis functions. The amplitudes of these projections are called the DCT coefficients. Let g(x,y), for x=0,1,2,,M1 and y=0,1,2,,N1, denote an M×N image and its DCT denoted by T[g(x,y)]:G(u,v), given by the equation




where A={M,N} depending on variables u and v, respectively. Low-order basis functions represent low spatial frequencies, while those of higher orders represent the higher spatial frequencies (Fig. 1). Therefore, low-order coefficients depict slow spatial variations in image intensity, while those of higher orders depict rapid variations.

Fig. 1

Relationship between DCT coefficients and frequency components of an image.


The DCT is closely related to the discrete Fourier transform (DFT), a standard tool in signal processing, and has been reported as a suitable transform for spectral-based focusing algorithms.18 However, the DCT has a greater energy compaction property than the DFT, i.e., most of the image information tends to be concentrated in a few low-frequency DCT coefficients. This is also why the JPEG compression standard is based on the DCT. In addition, many efficient schemes for the computation of DCT exist,19 and hardware implementations are commonly available.20


Normalized DCT

The normalized DCT21 of an image is defined as


This normalization is important because it leads to invariance to changes in the contrast of the image. This can be shown with the following: let g(x,y)=cg(x,y), where c is a non-zero scaling factor. Given that the DCT is linear, the normalized DCT of g is


which implies that the normalized DCT is contrast invariant and any measure based on this transform as well.

For illustrating the nature of blurring and the behavior of the DCT, we take the red channel from a sharp RGB fundus image (because it resembles more the NIR image) and simulate the imaging system as a linear shift-invariant system to acquire a sequence of images by varying the lens position. This was carried out by means of Fresnel propagation. In Fig. 2, we show the original sharp image, image patches of both the sharp and blurred images, and their DCT spectra (in the same log scale). Notice how the spectrum changes—there is less high- and mid-frequency content in the blurred image spectrum. In addition, in the original spectrum there are some favored orientations in the mid- and low-frequency coefficients, while in the blurred spectrum they seem to become more uniformly distributed. Another important feature is that in the blurred spectrum the coefficients related to high frequency have decreased significantly, and, as described in Sec. 2, many FMs are actually based on the idea of emphasizing high frequencies. While this may be true in theory, in practice there will always be noise contributing to the high-frequency content due to different acquisition conditions. Furthermore, given that the focusing mechanism involves acquiring a sequence of images, there will be spatial and temporal variations of noise.

Fig. 2

(a), Original sharp fundus image (R channel from RGB fundus image). (b), ROI from sharp image, and (c), its DCT spectrum. (d), ROI from blurred image, and (e), its DCT spectrum. For visualization purposes both spectra are shown in log scale. Coefficients with higher values are shown in red and those with lower values are shown in blue. The blurred image spectrum is dominated by low-order coefficients.



Focus Measure


Measure of Anisotropy

As we have seen in the previous example, the overall nature of blurring can be described as a low-pass filtering that tends to break down the characteristic anisotropy of the original image. The FM proposed here aims to quantify this anisotropic dependence based on the normalized DCT of the image.

To define our measure, we introduce some notation. From Eq. (10), G˜(u,v) is the normalized DCT of g(x,y) of size N×N, and λj, for j=1,2,3, is a vector along one of the three main orientations of the spectrum depicted in Fig. 3. We will restrict our study to angular partitions of the spectrum roughly equivalent to vertical, diagonal, and horizontal components of the image space. Our measure of anisotropy mainly consists in calculating a difference of weighted coefficients along these orientations. Let G˜j={G˜(u,v):θ=arctan(vu),θjθ<θj+1,j=1,2,3} be the set of DCT coefficients located between θj and θj+1 angles, for θj{0deg,30deg,60deg,90deg}. The function ψλj(.) takes as input Gj˜, performs orthogonal projection of all its elements along vector λj, and averages the elements that after projection fall on the same discrete (u,v) coordinates. With ψλj(.) we seek to compact the information around the three main orientations in a one-dimensional vector of N elements. To illustrate, let us compute ψλ1(G1˜)=[ψλ11,ψλ12,,ψλ1N]T, where G1˜ is the set of coefficients located between θ1=0deg and θ2=30deg. In Fig. 3(b), we show the projection of the coefficient with coordinates (4,2) along λ1. After projection, this coefficient has coordinates (4,1). Therefore, the element ψλ14=mean[G˜(4,1),G˜(4,2)]. Consequently, we can stack all ψλj to form the following matrix,

Note that the first element of each vector corresponds to the dc coefficient. This coefficient does not convey any directional information of the image; however, we decided to keep it in the matrix for the sake of completeness. To obtain a measure of anisotropy—the FM itself—from Ψ we compute the variance of the weighted sum of the columns, computed as the matrix product wΨ,


where w=[w1,w2,,wN], E is the expected value, and μ is the mean of the matrix product wΨ. The vector w can be regarded as a weighting procedure and with it we aim to achieve robustness to noise and illumination variation.

Fig. 3

(a) Vectors along the main directions of the DCT. (b) Projection of a coefficient along λ1.



DCT Coefficient Weighting

The first issue to address is the selection of a suitable w. In DCT-based pattern recognition, robustness is achieved by means of coefficient truncation.22 It is known that low frequencies are related to illumination variation and smooth regions, and high frequencies represent noise as well as small variations (like edge and details) of the image. The middle-frequency coefficients contain useful information of basic structure; therefore these are suitable candidates for recognition.23 Consequently, a trade-off between low-frequency and high-frequency truncation should be achieved to obtain a robust FM that is monotonic with respect to blur, unimodal, and at the same time robust to noise and illumination variations.

We decided to find a w that meets our requirements based on a training set of m images. This can be formulated as an optimization problem. The goal would be to find the vector w=[w1,w2,,wN] that simultaneously optimizes K objective values {J1(w),J2(w),,JK(w)}. Every objective value Jk(w) is formulated so that the FM Sa decreases with respect to blur, Sa(gik)>Sa(gi+1k)i=1,,m. There are K subsets of gi(x,y) all generated in the same way as described in Eq. (1), but they differ in that every k stands for a different kind of noise degradation, except for k=1, the noise-free case. In other words, we want to find a w that guarantees monotonicity of Sa with respect to blur under different types of noise. The objective values are implicitly defined in terms of permutations of the ordered set H={Sa(g1),Sa(g2),,Sa(gm)}. Thus, the reference permutation is πr={1,2,,m}, and any other arbitrary permutation of H violates the decreasing property of Sa with respect to blur. As a result, our goal is to find a w that produces permutations πk for all K types of noise equal to that of πr. The objective value is defined as the l1-norm of the difference between πr and πk,


It is zero for two identical permutations, and approaches zero as πk approaches πr. This is the same for all Jk(w); hence our single aggregate objective function24 is the weighted linear sum of all Jk(w), where all weights are equal to 1.

The solution to this problem is not a straightforward task, as the search space is multivariate and a unique global optimum cannot be guaranteed to exist. Therefore, we solved it using a probabilistic metaheuristic approach called simulated annealing.25 It provides an acceptably good solution in a fixed amount of time. Each step of the algorithm replaces the current solution by a random nearby solution, chosen with a probability that depends both on the difference between the corresponding function values and on a global parameter T (called the temperature), that is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly downhill as T goes to zero. (For further details see Ref. 24.)



Typically, FMs are applied to a region called the focusing window, which is much smaller than the image. To achieve real-time computation, we decided to implement our measure by dividing the focusing window into subwindows. The measure is computed in the following manner:

  • 1. The focusing window is divided into non-overlapping sub-images of size 16×16. This is chosen so that the most basic structures of the image fit in the subwindows.

  • 2. Each subwindow image is transformed with the normalized DCT, and the FM Sa is computed.

  • 3. An overall FM Sa¯ is computed by taking the mean of all Sa values from the subwindows.

According to this implementation, the parameter w consists of 16 elements. The considered noise degradations for the procedure described in Sec. 4.2 are Gaussian noise (σ2=0.001), speckle noise (σ2=0.001), and impulsive noise (d=0.01). The resulting w is shown in Fig. 4. As expected, the first two coefficients are practically zero. Observe the distribution of w instead of the individual values per coefficient. This means that a strong emphasis should be put to mid-frequency coefficients. It is perhaps not surprising that the distribution resembles a bandpass filter. This finding is consistent with the work in Ref. 9, where they showed that bandpass filtering causes the FMs to have sharp peaks while retaining monotonicity and unimodality. Interestingly, these weights also resemble the band pass response of the contrast sensitivity function of the human visual system. In the DCT domain, different approaches have been considered for computing visually optimized coefficients for a given image.26 A major feature of our approach is the fast computation of the FM. The average execution time per frame, in MATLAB implementation on a PC with a 2.66-GHz Intel Core 2 Duo processor, is 40 ms. In most cases this is sufficient; however, if needed, implementation in a low-level programming language could significantly reduce the execution time. In addition, because we divide the focusing window into subwindows, our implementation could be further improved by taking advantage of large parallel architectures such as in graphics processor unit computing.

Fig. 4

DCT coefficient weights obtained from the optimization procedure. The distribution resembles a bandpass filter.





Simulated Images and Robustness Assessment

To evaluate the robustness of our proposed FM Sa we simulated the focusing procedure. We generated a sequence gi(x,y) for i=1,,m from the red channel of a sharp RGB fundus image and propagated it at different distances through a linear imaging system of fixed focal length by means of Fresnel propagation. This is equivalent to displacing the lens or the sensor to look for the optimal focus position. From this noise-free sequence, we generated six additional sequences by corrupting it with two levels of three different types of noise: Gaussian, speckle, and impulse noise. We carried out this procedure for 20 retinal images for a total of 140 focusing sequences. Ideally, a noise-robust FM should produce the same (or similar) focusing curve for both the noise-free and corrupted sequences. To quantify the similarity between two focusing curves Sr and Sc we used the zero-lag normalized cross-correlation defined as


where r stands for the reference curve computed from the noise-free sequence and c the curve computed from the noise-corrupted sequence. The output is 1 in the case of perfect correlation and 0 for no correlation at all. The reason for the zero-lag calculation, as opposed to the regular cross-correlation by sliding dot product, is that we need the maxima of the curves to coincide in the horizontal position as well as the matching of the profiles.

All FMs were computed using a focusing window of 128×128pixels located over retinal structures. In Fig. 5, we show an example to illustrate the robustness assessment of the FMs. The FM curves represent the normalized measure value over the search space for different lens positions. The highest value should be obtained when the lens is in the optimal focus position identified by the dashed vertical line. As the lens gets farther from the optimal position, the measure value should decrease proportionally to the distance. It comes as no surprise that all measures performed sufficiently well in the noise-free sequence shown in Fig. 5(a), where all curves follow a typical bell shape with a unique maximum. However, in the curves shown in Fig. 5(b)5(d) where the focusing sequence is corrupted by different types of noise, the proposed FM Sa clearly outperforms the other measures in terms of monotonicity and unimodality. Notice that under Gaussian and speckle noise [Fig. 5(b)5(c)], the Sa curves are nearly identical to the noise-free Sa curve in Fig. 5(a). Without jumping to conclusions, this result is interesting because it graphically shows the robustness of the proposed FM. The results for all of the 140 sequences are summarized in Table 1. Each value represents the average cross-correlation obtained for all 20 sequences corrupted with a specified type and level of noise for a particular FM. The overall average for each FM is shown in the last column. These results provide further evidence that the proposed FM Sa has a considerable robustness to noise, with an overall performance value of 0.929 and an exceptional 0.996 for the sequence corrupted with Gaussian noise with σ2=0.001. The second and third best FMs were S4 and S1, with overall values of 0.781 and 0.502, respectively. In comparison with Sa these values represent a moderate to mild noise robustness. In the following section, we use these two FMs to compare with Sa in real images.

Fig. 5

Focus measures curves for the simulated images. The dashed vertical line indicates the correct focused position. (a) Noise-free images. (b) Gaussian noise (σ2=0.001). (c) Speckle noise (σ2=0.001). (d) Impulse noise (d=0.01).


Table 1

Average normalized cross-correlation results for noise robustness assessment of focus measures from 140 sequences generated from 20 retinal images corrupted with different types and levels of noise. The three best FMs are in bold type.

Gaussian (σ2=0.001)Gaussian (σ2=0.005)Speckle (σ2=0.001)Speckle (σ2=0.005)Impulse (d=0.01)Impulse (d=0.05)Overall average


Real Images

In this subsection we show the results obtained from real NIR-focusing eye fundus images. The images have a relatively low signal to noise ratio (SNR) which justifies the need for a robust FM. All images were acquired using a digital fundus camera system (TRC-NW6S, Topcon, Tokyo, Japan) taking the video output from the infrared focusing system with a resolution of 640×480. The focusing system enables a compensation range of 13D12D in normal operation. For strong myopia or hyperopia, two additional compensation lenses are available to compensate the ranges: 12D33D and +9D+40D, respectively. The image sequences analyzed here were acquired by means of an in-house assembled motor mechanism for the displacement of the compensation lens across the whole range for normal operation.

It is well known that as a person ages the crystalline lens of the eye gradually gets opacified, obstructing the passage of light. This is called a cataract. A complete loss of transparency is observed only in advanced stages in untreated patients. In early stages of cataracts retinal examination is considered practicable—however, it is not without difficulty. For this reason, we decided to test our focusing method on healthy young subjects and elderly subjects with first signs of cataracts, not only to demonstrate its applicability on real images, but to assess its limitations as well. In this work we show results from five representative subjects of ages 27, 40, 68, 70, and 81 years for a total number of 10 eye fundi.

First we show the effects of placing the focusing window on different regions of the retinal image. A retinal image has distinct sharp structures such as the blood vessels and the optic disk, as opposed to the relatively uniform background. No FM is reliable without placing the focusing window on top of structures with edges, a fact easily appreciable from the three focusing curves shown in Fig. 6, which were computed from the right eye fundus of the 27-year-old subject. The optimal focus position identified by the dashed vertical line was verified via the split-line focusing mechanism. The Sa curves computed from regions (b) and (c) are clearly reliable in terms of monotonicity and unimodality and coincide on the optimal focus position. Conversely, the S1 and S4 curves fail to produce a reliable profile against the Sa curves that display a steeper peak at the optimal focus position, evidence of the measure’s robustness to noise. In contrast, all measures computed from region (d) are unusable because they are mainly given by noise.

Fig. 6

Focus measure curves obtained by placing the focusing window over different regions of (a) the retinal image. The dashed vertical line indicates the correct focused position. Areas (b) and (c) are located over prominent retinal structures, whereas (d) is located over a relatively uniform region.


To illustrate the link between the focusing curves and the image quality, in Fig. 7 we show three image details depicting the optic disk region for three different focusing positions. The image detail in Fig. 7(b) corresponds to the focused image (optimal focus position 11 in the Sa curve Fig. 6). Notice how this image is properly focused: it has sharp details such as the blood vessels. The other two images are blurred, demonstrating the consistency of the Sa curves with image quality or sharpness. The result that emerges from this example is that to effectively locate the best-focused image, homogeneous regions should be avoided. An adaptive technique, based in an edge detector for example, could prove useful for detecting such prominent structures and therefore candidate regions for applying the focusing technique automatically. The focusing curves shown hereafter, however, were all computed from a focusing window located manually over retinal structures.

Fig. 7

Image detail from Fig. 6 for different focusing positions. (a) 6, (b), 11 (optimal focus), and (c), 15. The positions are in reference to Fig. 6(b)6(c).


To further analyze the performance of the FM in Fig. 8, we show the focusing curves obtained from four of the five subjects; the ages are shown in the figure caption. From the four cases shown only in one [Fig. 8(c)], the Sa measure peak did not coincide precisely with the optimal focus position. However, the error is no more than a single position. The FMs curves of S1 and S4 are generally flatter than those of Sa which in a focus search strategy is not wanted because of the difficulty to properly distinguish the optimum position in a coarse or initial search. From the curves in Fig. 8, we can also note that there appears to be little difference between the curves from young and elderly subjects. In Fig. 9, we show the focusing curves obtained from the 81-year-old subject for both eye fundi. This case is interesting on its own because in the right eye [Fig. 9(a)], the crystalline lens has been extracted and replaced with an intraocular lens, whereas the left eye [Fig. 9(b)], is in an early stage of cataract. While both focusing curves are able to successfully identify the optimal focus position, the curve in Fig. 9(b) is certainly flatter throughout most of the search space. This is most likely due to the difference in visibility and clarity from both eyes. In general, from the comparison against S1 and S4 it can clearly be stated that the proposed FM Sa outperforms them in the considered cases.

Fig. 8

Focusing curves obtained from four subjects with ages 27 (a) 40 (b) 68 (c), and 70 (d) years. The dashed vertical line indicates the correct focused position.


Fig. 9

Focusing curves obtained from the 81-year-old subject for each eye fundus. In the right eye (a), the crystalline lens has been extracted and replaced with an intraocular lens. The left eye (b), is in an early stage of cataract. The dashed vertical line indicates the correct focused position.


A close examination of the results reveal that the shape of the focusing curve is not exclusively given by the degree of defocus, but by the state of the subject’s eye and the analyzed region of the fundus as well. This is important because it conditions the strategy for searching the optimal focus position. Finally, even though the results seem to indicate that the FM could be successfully applied to both young and elderly subjects, further research on a higher number and variety of subjects is necessary. Additionally, we report here that we encountered some difficulty in the procedure with the elderly subjects related to sustaining fixation during the acquisition procedure. From an initial number of six subjects one was excluded from all calculations due to this condition. Patient inability to successfully establish fixation is a true challenge in fundus photography, and dealing with it is out of the scope of this work.



In this paper, a new robust FM for nonmydriatic retinal imaging has been proposed. It is based on a measure of anisotropy, mainly the weighted directional variance of the normalized DCT. The weights were calculated by means of an optimization procedure to maximize the noise robustness of the FM. Not only were the resulting weights in agreement with previous works,9 but they also provide a key insight into the design of noise-invariant FMs. By both simulation and real fundus imaging, we demonstrated the robustness and the accuracy of the proposed FM, which clearly outperformed the other considered measures. The findings presented here may have a number of implications for the design and operation of auto-focusing in modern retinal cameras. Finally, in this study we included several young and elderly subjects to assess the limitations of the proposed FM. Even though we found no significant differences between the focusing curves, there was some difficulty in the acquisition of images from the elderly mainly given by inability to sustain fixation. As with all such studies, there are limitations that offer opportunities for further research. Adapting our method to these variations within the patient population is a goal worth pursuing.


This research has been partly funded by the Spanish Ministerio de Economía y Competitividad and Fondos FEDER (project DPI2009-08879) and projects TEC2010-09834-E and TEC2010-20307. The first author also thanks the Spanish Ministerio de Educación for an FPU doctoral scholarship. The authors especially thank the Pérez-Cabré and the Sisquella-Cabré families for their cooperation in the experimental session.



Bennett T. J., Barry C. J., “Ophthalmic imaging today: an ophthalmic photographer’s viewpoint—a review,” Clin. Exp. Ophthalmol., 37 (1), 2 –13 (2009). http://dx.doi.org/10.1111/ceo.2009.37.issue-1 1442-6404 Google Scholar


Moscaritolo M. et al., “An image based auto-focusing algorithm for digital fundus photography,” IEEE Trans. Med. Imag., 28 (11), 1703 –1707 (2009). http://dx.doi.org/10.1109/TMI.2009.2019755 ITMID4 0278-0062 Google Scholar


Marrugo A. G. et al., “Retinal image restoration by means of blind deconvolution,” J. Biomed. Opt., 16 (11), 116016 (2011). http://dx.doi.org/10.1117/1.3652709 JBOPFO 1083-3668 Google Scholar


Bernardes R., Serranho P., Lobo C., “Digital ocular fundus imaging: a review,” Ophthalmologica, 226 (4), 161 –181 (2011). http://dx.doi.org/10.1159/000329597 OPHTAD 0030-3755 Google Scholar


Abramoff M. D., Garvin M. K., Sonka M., “Retinal imaging and image analysis,” IEEE Rev. Biomed. Eng., 3 169 –208 (2010). http://dx.doi.org/10.1109/RBME.2010.2084567 Google Scholar


Marrugo A. G. et al., “No-reference quality metrics for eye fundus imaging,” CAIP 2011, Lect. Notes Comput. Sci., 6854 486 –493 (2011). http://dx.doi.org/10.1007/978-3-642-23672-3 0302-9743 Google Scholar


Gabarda S., Cristóbal G., “Blind image quality assessment through anisotropy,” J. Opt. Soc. Am. A, 24 (12), B42 –B51 (2007). http://dx.doi.org/10.1364/JOSAA.24.000B42 JOAOD6 0740-3232 Google Scholar


Bedggood P. et al., “Characteristics of the human isoplanatic patch and implications for adaptive optics retinal imaging,” J. Biomed. Opt., 13 (2), 024008 (2008). http://dx.doi.org/10.1117/1.2907211 JBOPFO 1083-3668 Google Scholar


Subbarao M., Choi T. S., Nikzad A., “Focusing techniques,” Opt. Eng. , 32 (11), 2824 –2836 (1993). http://dx.doi.org/10.1117/12.147706 OPENEI 0892-354X Google Scholar


Groen F. C. A., Young I. T., Ligthart G., “A comparison of different focus functions for use in autofocus algorithms,” Cytometry, 6 (2), 81 –91 (1985). http://dx.doi.org/10.1002/(ISSN)1097-0320 CYTODQ 0196-4763 Google Scholar


Lee S. Y. et al., “Enhanced autofocus algorithm using robust focus measure and fuzzy reasoning,” IEEE Trans. Circuits System. Video Technol., 18 (9), 1237 –1246 (2008). http://dx.doi.org/10.1109/TCSVT.2008.924105 ITCTEM 1051-8215 Google Scholar


Kautsky J. et al., “A new wavelet-based measure of image focus,” Pattern Recogn. Lett., 23 (14), 1785 –1794 (2002). http://dx.doi.org/10.1016/S0167-8655(02)00152-6 PRLEDG 0167-8655 Google Scholar


Aslantas V., Kurban R., “A comparison of criterion functions for fusion of multi-focus noisy images,” Opt. Commun., 282 (16), 3231 –3242 (2009). http://dx.doi.org/10.1016/j.optcom.2009.05.021 OPCOB8 0030-4018 Google Scholar


Liatsis P., Kantartzis P., “Real-time colour segmentation and autofocus in retinal images,” in ELMAR, 47th International Symposium, 13 –18 (2005). Google Scholar


Nayar S. K., Nakagawa Y., “Shape from focus,” IEEE Trans. Pattern Anal. Mach. Intell., 16 (8), 824 –831 (1994). http://dx.doi.org/10.1109/34.308479 ITPIDJ 0162-8828 Google Scholar


Choi K. S., Lee J. S., Ko S. J., “New autofocusing technique using the frequency selective weighted median filter for video cameras,” IEEE Trans. Consum. Electron., 45 (3), 820 –827 (1999). http://dx.doi.org/10.1109/30.793616 ITCEDA 0098-3063 Google Scholar


Subbarao M., Tyan J.-K., “Selecting the optimal focus measure for autofocusing and depth-from-focus,” IEEE Trans. Pattern Anal. Mach. Intell., 20 (8), 864 –870 (1998). http://dx.doi.org/10.1109/34.709612 ITPIDJ 0162-8828 Google Scholar


Chern N. N. K., Neow P. A., Ang M. H., “Practical issues in pixel-based autofocusing for machine vision,” 2791 –2796 (2001). http://dx.doi.org/10.1109/ROBOT.2001.933045 Google Scholar


Wallace G. K., “The jpeg still picture compression standard,” IEEE Trans. Consum. Electron., 38 (1), 18 –34 (1992). http://dx.doi.org/10.1109/30.125072 0098-3063 Google Scholar


Ramirez J. et al., “A new architecture to compute the discrete cosine transform using the quadratic residue number system,” 321 –324 (2000). http://dx.doi.org/10.1109/ISCAS.2000.857429 Google Scholar


Kristan M. et al., “A Bayes-spectral-entropy-based measure of camera focus using a discrete cosine transform,” Pattern Recogn. Lett., 27 (13), 1431 –1439 (2006). http://dx.doi.org/10.1016/j.patrec.2006.01.016 PRLEDG 0167-8655 Google Scholar


Lian Z., Er M. J., “Illumination normalisation for face recognition in transformed domain,” Electron. Lett., 46 (15), 1060 –1061 (2010). http://dx.doi.org/10.1049/el.2010.1495 ELLEAK 0013-5194 Google Scholar


Chen W., Er M. J., Wu S., “Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain,” IEEE Trans. Syst. Man Cybern. B Cybern., 36 (2), 458 –466 (2006). http://dx.doi.org/10.1109/TSMCB.2005.857353 1083-4419 Google Scholar


Suppapitnarm A. et al., “A simulated annealing algorithm for multiobjective optimization,” Eng. Optimiz., 33 (1), 59 –85 (2000). http://dx.doi.org/10.1080/03052150008940911 EGOPAX 0305-215X Google Scholar


Granville V., Krivanek M., Rasson J.-P., “Simulated annealing: a proof of convergence,” IEEE Trans. Pattern Anal. Mach. Intell., 16 (6), 652 –656 (1994). http://dx.doi.org/10.1109/34.295910 ITPIDJ 0162-8828 Google Scholar


Watson A. B., “Perceptual optimization of DCT color quantization matrices,” in ICIP94, IEEE Int. Conf. Image Proc., 100 –104 (1994). http://dx.doi.org/10.1109/ICIP.1994.413283 Google Scholar
© 2012 Society of Photo-Optical Instrumentation Engineers (SPIE)
Andres G. Marrugo, Andres G. Marrugo, María S. Millán, María S. Millán, Hector C. Abril Baez, Hector C. Abril Baez, Gabriel Cristóbal, Gabriel Cristóbal, Salvador Gabarda, Salvador Gabarda, "Anisotropy-based robust focus measure for non-mydriatic retinal imaging," Journal of Biomedical Optics 17(7), 076021 (17 July 2012). https://doi.org/10.1117/1.JBO.17.7.076021 . Submission:

Back to Top