Users of visual display terminals (VDTs) commonly complain of visual fatigue after prolonged work in front of the computer. The term “computer vision syndrome” was coined to describe the diverse symptoms reported by computer users including eyestrain, tired eyes, irritation, a burning sensation, dry eye, redness, blurred far vision, and double vision.1 Among these, dry eye is the most frequently reported ocular complaint of VDT users.2,3 Computer use has been associated with an alteration of the blinking patterns and with a larger palpebral aperture, which is influenced by screen position. The joint contribution of both factors results in a greater exposure of the ocular surface to the environment and in an increased tear film evaporation and instability, leading to dry eye symptomatology.4
Spontaneous eye blink rate (SEBR), usually measured in blinks per minute (blinks/min), has been found to be very sensitive to changes in cognitive demand. For instance, SEBR was observed to increase from reading to rest, with a further increase during conversation.5 Similarly, several authors described a sharp decrease in SEBR when subjects perform a highly demanding task with the computer. Indeed, Skotte et al.6 noted a decrease in SEBR from 16 blinks/min when comparing a passive task (watching a film) with an active computer task (connecting a sequence of small dots on the screen). Similarly, Himebaugh et al.7 evaluated SEBR while participants carried out VDT tasks requiring low levels of concentration (looking at a blank computer screen or watching a film) and high levels of concentration (playing a computer game or viewing a series of rapidly changing letters). They observed a reduced blink rate during the high-concentration activities, together with greater fluctuation in SEBR values, particularly during the computer game trial.
Given its relevance to multiple fields of science, blinking of the human eye has been studied for decades by psychologists, psychiatrists, ophthalmologists, and neurophysiologists. Some authors used electro-oculography for this purpose,6 a relatively complex technique that is not easily applicable to blink monitoring in a real-life working environment. More recently, however, the incorporation of inexpensive integrated cameras in computers suggests the possibility of evaluating SEBR with image processing techniques rather than with more invasive or intrusive methods. Several efforts have since been made to develop automatic blink detection strategies.
Won et al.8 described a blink detection algorithm based on binary images. The binarization threshold is critical and depends on the illumination conditions of the image; that is, it requires a prior normalization process whereby the threshold is automatically determined.9 Won et al. used two features to detect whether the eye was open or closed: first, consecutive frames were compared to determine the number of accumulated black pixels, since in the open eye condition the presence of the iris/pupil leads to a greater number of black pixels; second, the relation between iris height and width was measured. These two features were combined using a support vector machine to determine the frames, and thus the time, during which the eyelids were closed.
Similarly, Jiang et al.10 were able to detect the beginning and the end of an eye blink. The difference between two consecutive frames was binarized, and morphological operations were employed to determine the presence of the iris. The detection of the iris was based on dimension parameters requiring the definition of several thresholds, which needed to be optimized in advance. With optimal values for these thresholds, the authors reported true-positive rates (TPRs) of 90.3% and false-positive rates (FPRs) of 0.1%; this is an accuracy of 99.7% using their technique. A level of precision of 66% was reported by Tan and Zhang,11 in their proposed method for iris detection through pattern recognition, which was subsequently improved to take into account the different configurations resulting from the actual position of the iris with respect to the maximal response zone.12 With this method, the authors reported an accuracy of 88% in blink detection.
Finally, Mitelman et al.13 developed a semiautomatic eye state detection algorithm that distinguishes between open and closed eye conditions by examining the brightness and the frequency distribution of the image. This method, which requires a training process to define several thresholds, relies on brightness peaks arising from the iris and pupil regions. Later, Bernard et al.14 implemented an image processing analysis to detect the two lines that correspond to the margins of the eyelids, whereupon the distance between these two lines was monitored to identify eye blinks.
Blinking Supervision with Image Processing
The blink counting algorithm that was developed in this study consists of a combination of known image processing algorithms, with the addition of a new algorithm which was inspired by the work of Jiang et al.10 The present algorithm is conceptually divided into two tasks: eye segmentation and blink counting.
The first task, eye segmentation, comprises two key procedures: eye detection and eye tracking. Combining these procedures reduces the computing time required to locate the eye in each frame, since eye tracking only needs to operate on a portion of the image. The second task also combines two algorithms: iris detection and evaluation of the iris height/width ratio. This redundancy was introduced to improve accuracy and to avoid false blink detections (false positives): blinks are only counted when both algorithms detect a blink within the same set of consecutive frames. A detailed description of these algorithms is given in the following sections.
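The redundancy rule can be illustrated with a minimal Python sketch. It assumes each algorithm outputs one boolean per frame and that "the same set of consecutive frames" can be approximated by a tolerance window; the function name and the `window` parameter are hypothetical and do not come from the original implementation.

```python
from typing import Sequence


def count_confirmed_blinks(iris_blink: Sequence[bool],
                           ratio_blink: Sequence[bool],
                           window: int = 2) -> int:
    """Count blinks reported by both detectors within `window` frames.

    iris_blink[k]  -- True when the iris detector reports a blink at frame k
    ratio_blink[k] -- True when the height/width detector reports a blink
    window         -- hypothetical tolerance (in frames) for agreement
    """
    if len(iris_blink) != len(ratio_blink):
        raise ValueError("both detectors must cover the same frames")
    n = len(iris_blink)
    count = 0
    in_blink = False  # True while a confirmed blink is still in progress
    for k in range(n):
        lo, hi = max(0, k - window), min(n, k + window + 1)
        # A frame is confirmed only if the second detector agrees nearby.
        confirmed = iris_blink[k] and any(ratio_blink[lo:hi])
        if confirmed and not in_blink:
            count += 1  # count each confirmed blink once, at its first frame
        in_blink = confirmed
    return count
```

In this sketch, a blink reported by the iris detector alone (for example, because of a segmentation failure) is discarded, which is the behavior the redundancy is meant to provide.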
For eye detection, the rapid object detection algorithm developed by Viola and Jones15 is applied. This algorithm was first created to identify faces that are in an image, using a learning cascade feature detector. The implementation in MATLAB® (MathWorks, Inc., Natick, Massachusetts) of this algorithm can detect eyes, mouths, and noses, and it may also be trained to detect other user-defined objects (facial features). The algorithm works by locating the left eye of the subject in the first frame, whereupon the eye tracking algorithm becomes active until the eye has disappeared (see Fig. 1). At that time, the cascade learning feature detector recommences and the process continues. It must be noted that blinking does not interfere with eye detection. In fact, even if the iris is lost during the interval of a blink, the eye is still detected.
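The alternation between detection and tracking can be sketched as a simple control loop. This is only an outline under stated assumptions: `detect` and `track` are hypothetical callables standing in for the cascade detector and the feature tracker, and each returns a bounding box or `None` when the eye is not found.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
Box = Tuple[int, int, int, int]  # x, y, width, height


def segment_eye(frames: Iterable[Frame],
                detect: Callable[[Frame], Optional[Box]],
                track: Callable[[Frame, Box], Optional[Box]],
                ) -> Iterator[Optional[Box]]:
    """Yield the eye region for each frame, or None if the eye is not found."""
    box: Optional[Box] = None
    for frame in frames:
        if box is not None:
            # Tracking only searches around the previous eye region.
            box = track(frame, box)
        if box is None:
            # Eye not yet located, or lost by the tracker: run the detector.
            box = detect(frame)
        yield box
```

The detector is therefore only invoked on the first frame and whenever tracking fails, which is where the saving in computing time comes from.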
After the eye is detected, the region where the eye is located is used as the input region for the Kanade–Lucas–Tomasi feature tracking algorithm.16,17 The MATLAB® implementation of this algorithm is comprehensive: it handles the tracking features and allows the size and location of the feature search space to be updated according to the relation between the positions of these features in consecutive frames.
In the present application of the algorithm, the configuration that is considered optimal is as follows: the number of pyramid levels where the tracking points are looked for is 4; the maximum bidirectional error, which is a parameter to help check good tracking points and eliminate uncertain ones, is 2; the maximum number of search iterations is 40; and the type of assumed transformations between frames is similarity, which means changes in scale, position, and orientation of the object are allowed.
Finally, provided that the eye is successfully tracked from one frame to the next, the region where the eye is located is used as the input region for blink detection. Blinks are counted by the two algorithms described next.
The aim of the iris detection algorithm is to identify and segment the iris in each frame so that, when the iris is not detected, the algorithm assumes a blink has taken place. This algorithm is inspired by the work of Jiang et al.,10 although several important modifications were implemented. First, the luminosity of the image is normalized using a median filter, after which the Otsu18 optimal threshold binarization is applied. Finally, the eyebrow is erased with a mask and the borders are removed (to eliminate stray portions of glasses, hair, and so on) (see Fig. 2).
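For reference, Otsu's method selects the threshold that maximizes the between-class variance of the grey-level histogram. The following pure-Python sketch illustrates the idea on a flat list of 8-bit pixel values; it is not the MATLAB routine used in the study, and it omits the median filtering and masking steps.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * n for level, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of grey levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break             # no bright pixels left
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels: Sequence[int], threshold: int) -> List[bool]:
    """Mark dark pixels (at or below the threshold) as True."""
    return [p <= threshold for p in pixels]
```

Because the threshold is derived from the histogram of each image, no illumination-dependent constant has to be set by hand at this step.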
At this point of the process, the image contains a black eye shape with fragments of the eyelids and, less frequently, of the eyebrows. In order to remove all but the iris (see Fig. 3), an opening process9,18 is applied with a disk structuring element with a radius of four pixels. This operation keeps the iris and sometimes other round-shaped elements. The image is then labeled, and only the largest object is kept. Provided that the iris is the only visible shape, an erosion process9,18 is subsequently applied, which homogenizes the shape of the iris.
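The labeling step, in which only the largest object is kept, can be sketched as follows. The sketch assumes a binary mask stored as a list of rows (True for dark pixels) and 4-connectivity; the connectivity used in the original implementation is not stated, and the opening and erosion steps are not reproduced here.

```python
from collections import deque
from typing import List, Tuple


def largest_component(mask: List[List[bool]]) -> List[List[bool]]:
    """Return a mask containing only the largest 4-connected object."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best: List[Tuple[int, int]] = []

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected object.
            seen[r][c] = True
            queue = deque([(r, c)])
            component: List[Tuple[int, int]] = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component

    out = [[False] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = True
    return out
```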
Finally, the Hough transform9,18 is used to detect circular shapes in the image and thus to segment the iris. If one circular shape is detected, a no-blink condition is registered. Conversely, when a circle is detected in a frame but lost in the following one, the algorithm registers the beginning of a blink. The sensitivity of the circular Hough transform, which is set to 0.82, determines whether the shape under consideration is circular or not.
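The rule for registering the beginning of a blink can be written compactly. This sketch assumes the circle detector has already produced one boolean per frame; the function name is hypothetical.

```python
from typing import List, Sequence


def blink_onsets(circle_found: Sequence[bool]) -> List[int]:
    """Return the frame indices at which a blink begins.

    A blink begins at frame k when a circle was detected in frame k - 1
    but is lost in frame k.
    """
    return [k for k in range(1, len(circle_found))
            if circle_found[k - 1] and not circle_found[k]]
```

Frames in which the circle remains undetected after the onset belong to the same blink and are not counted again.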
It must be noted that this algorithm is not perfect. Indeed, in some cases, the iris is not properly segmented from the surrounding anatomical structures, such as the margins of the eyelids, thus failing to detect a circular shape, which leads to a false blink count (false positive). Therefore, as noted previously, the iris detection algorithm is combined with a second algorithm based on the iris height/width ratio evaluation in order to improve the accuracy in blink detection.
Iris Height/Width Ratio Evaluation
This algorithm computes and compares the width and height of the iris. It is based on the work of Won et al.,8 although only the iris is used in the present modification of the algorithm, whereas these authors assessed the entire eye. Iris width and height were selected because their ratio changes significantly during an eye blink.
The maximum horizontal width and vertical height of the iris are measured from the image obtained in the last step of the iris detection algorithm. Assume the algorithm is processing a given frame. Since the image contains a round-shaped object, the first column, starting from the left, that contains a black pixel is the leftmost end of the iris. Similarly, the rightmost, top, and bottom ends are identified. The height–width ratio of the frame is then obtained from these four extremes.
It must be noted that, to ensure correct detection, the ratio equation includes a parameter that needs to be adjusted depending on the illumination conditions, the camera configuration, the distance to the subject, and other factors. This parameter takes a value between 0 and 1, typically around 0.9.
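The measurement of the four extremes can be illustrated with the sketch below. It only covers the geometric part: the way the adjustable parameter enters the equation is not reproduced here, and counting the extent inclusively in pixels (the `+ 1` terms) is an assumption of this sketch.

```python
from typing import List, Optional, Tuple


def iris_extents(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, right, top, bottom) of the dark pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return cols[0], cols[-1], rows[0], rows[-1]


def height_width_ratio(mask: List[List[bool]]) -> Optional[float]:
    """Return the height/width ratio of the segmented iris, or None if absent."""
    extents = iris_extents(mask)
    if extents is None:
        return None
    left, right, top, bottom = extents
    width = right - left + 1    # inclusive pixel count (assumption)
    height = bottom - top + 1
    return height / width
```

For a fully visible, roughly circular iris the ratio is close to 1, and it drops as the eyelid covers the upper part of the iris.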
Algorithm Testing (Preliminary Results)
Preliminary trials revealed that the current version of the algorithm was not fast enough to be useful for real-time video stream monitoring. Consequently, in its current state of development, it was only used with recorded videos.
The algorithm was tested on 17 one-minute videos of subjects undertaking different actions on personal computers (reading texts, playing games, browsing the web, and so on). All participants provided written informed consent after the nature of the study was explained to them.
A variety of illumination conditions, working distances, face configurations (with and without glasses), skin tones, and webcam resolutions were included in the preliminary trials to assess the performance of the algorithm in conditions that were less than ideal but closer to real life. Each video was manually reviewed to determine the true blink count, and this value was then compared with the value obtained by our algorithm to calculate true blink positives (TP), false blink positives (FP), true blink negatives (TN), and false blink negatives (FN). Furthermore, true-positive rates (TPR = TP/(TP + FN)) and false-positive rates (FPR = FP/(FP + TN)) were determined to compare the performance of our algorithm with that of previously described algorithms.
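The two rates follow the standard definitions, TPR = TP/(TP + FN) and FPR = FP/(FP + TN). A short sketch with hypothetical counts (not data from the study) shows the arithmetic; how negative events were counted in the trials is not detailed here, so the example values are purely illustrative.

```python
from typing import Tuple


def detection_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (TPR, FPR) as fractions between 0 and 1."""
    positives = tp + fn   # all real blinks
    negatives = fp + tn   # all non-blink events
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


# Hypothetical counts: 9 of 10 real blinks detected, 1 false alarm in 100 negatives.
tpr, fpr = detection_rates(tp=9, fp=1, tn=99, fn=1)
print(f"TPR = {tpr:.2%}, FPR = {fpr:.2%}")
```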
After reviewing the 17 one-minute videos, the total number of blinks was 269, with a range from 2 to 45 per video. The mean TPR was 87.54% and the mean FPR was 0.19%, in accordance with the published report by Jiang et al.10 The range of TPR was from 30% to 100%, and the range of FPR was from 0% to 1.7%. It must be noted that the worst values of TPR and FPR corresponded to the combination of very challenging illumination conditions, dark skin tone with features of interest that were more difficult to discriminate from the background, and low camera resolutions, resulting in video captures in which it was very difficult to observe the eye of the participants. In contrast, TPR and FPR values were close to 100% and 0%, respectively, provided that illumination conditions were close to those recommended by ergonomic standards such as, for example, ISO 9241-6,19 which notes that the average room illumination should be between 320 and 600 lx, uniform, and without large differences between the surrounding environment and the workstation. In addition, a webcam resolution of at least 720 pixels was considered a requirement for quality image acquisition. Provided that these minimum criteria were met, it was found that the parameter did not need further adjustment prior to video analysis.
For instance, good blink detection conditions are shown in Fig. 4, in which the subject blinked 19 times. Although our algorithm slightly overestimated the number of true blinks, all real blinks were detected, resulting in a TPR of 100% and an FPR of 0.52%. Conversely, Fig. 5 depicts a more challenging situation, in which the combination of darker skin tone, low webcam resolution, and unsatisfactory illumination conditions leads to an underestimation of true blinks (only 9 of 23 real blinks were successfully detected), with TPR and FPR values of 30.43% and 0.06%, respectively.
The present research aims to develop and implement an algorithm for automatic blink detection and counting. Preliminary trials on recorded videos show good sensitivity of the algorithm to detect blinks, provided that normal illumination conditions and webcam resolutions are present. Given the relevance of blink frequency in the visual fatigue symptoms experienced by most computer users, noninvasive and nonintrusive blink monitoring strategies are a first step toward developing biofeedback mechanisms for blink re-education. The innovation of the present algorithm lies in requiring the configuration of only one parameter, which may be kept constant if the workplace has normal illumination conditions, and in being functional on most computer-integrated webcams, thus supporting the need for further research to advance its implementation on other ubiquitous devices, such as tablets and smartphones. Further research is being carried out to make the algorithm operational on real-time video streams and with standard computing languages and tools. An application incorporating biofeedback for blink re-education is currently under development.
The authors are grateful to all the staff and students of the Secondary School Josep Lladonosa of Lleida, Spain, who participated in the video recording process for this research study. E. Pérez and G. Cardona thank the Spanish Ministerio de Economía y Competitividad and Fondos FEDER for financial support (Project Number DPI2013-43220-R).
Bernardo Morcego is an associate professor at the Universitat Politècnica de Catalunya (UPC). He received his PhD in computer science from the UPC in 2000. He has been teaching several subjects in the area of automatic control in the schools of engineering and aeronautics in Terrassa and Barcelona. He is a member of the Research Center for Supervision, Safety, and Automatic Control of UPC. His research interests include UAV control systems and computer vision applications.
Marc Argilés graduated in optometry in 2009 and was awarded a Master of Science in vision in 2011 by the UPC. He is currently undertaking a PhD in optical engineering at UPC, with research interests related to dry eye symptoms, computer vision syndrome, and the relationship of ocular blinking and cognition. He is actively involved in various projects linking new visual display terminals with related vision problems.
Marc Cabrerizo received his degree in industrial electronics and automatic control from the UPC, Terrassa, Spain, in 2014. He is currently working as PLC and robot programmer in a Spanish manufacturer of packaging machinery (SR INNOVA) located in Barcelona, Spain.
Genís Cardona received his degree in optometry from the UPC in 1992, and MSc and PhD degrees from the University of Manchester Institute of Science and Technology, UK, in 1994 and 1996, respectively. He is currently employed as a full-time lecturer at the Department of Optics and Optometry in the UPC. His research interests include ocular surface and tear film, contact lenses, refractive surgery, blinking, and intraocular lenses.
Ramon Pérez received his MSc degree in physics from the University of Barcelona in 1993 and his PhD in physics from UPC, Terrassa, Spain, in 2003. He currently holds a position as a lecturer at the Department of Automatic Control in the same university. His current research interests include control and supervision, particularly focused on water systems. He is part of the Advanced Control Systems (cs2ac) research group at the UPC and of the Technological Center of Manresa (CTM).
Elisabet Pérez-Cabré received her PhD in physics from the UPC in 1998. Her academic activities at the School of Optics and Optometry in UPC involve lecturing, mainly on fundamental optics and optical instruments. Her current research interests include encryption techniques, programmable diffractive optical elements, and biomedical optics. She is a senior member of the International Society for Optical Engineering (SPIE). She is also a member of the Spanish Optical Society (SEDOPTICA) and the European Optical Society (EOS).
Joan Gispets was awarded his degree in optometry from the UPC in 1992, his MSc degree in optometry and vision science from the University of Manchester in 1993, and his PhD from UPC in 2009. He has been a faculty member at the Department of Optics and Optometry at UPC since 1995. He is currently dean of the faculty. His research interests are related to contact lenses, keratoconus, noninvasive diagnostic techniques, and myopia.