Currently, various display devices, such as the plasma display panel (PDP), liquid crystal display (LCD), light-emitting diode (LED) display, active-matrix organic light-emitting diode (AMOLED) display, and stereoscopic TV, are being manufactured. The use of these display devices is becoming increasingly widespread, as they are rapidly adopted for laptop computers, mobile phones, high-definition TV (HD TV), and so on. Many manufacturers and consumers are interested in the attributes of these display devices, including their field of view, spatial resolution, response speed, and degree of motion blur. In addition to these quantitative characteristics, consumers expect good display quality in terms of human factors.
Researchers have previously measured the eyestrain of users watching display devices.1–8 Some of these studies compared the levels of eyestrain caused by watching LCD and PDP devices on the basis of changes in pupil size, eye blinking, and subjective tests.1–3 Other studies investigated the relationships between the eyestrain caused by an LCD device and video factors such as brightness, contrast, saturation, hue, edge difference, and scene changes.4,5 In addition, the eyestrain caused by a stereoscopic display was examined using subjective measurements, optometric instrument-based measurements, clinical optometric measurements, and brain activity measurements.6,7 In previous research, the eyestrain caused by two- and three-dimensional (2-D and 3-D) displays was compared using the average blinking rate (BR).8 However, most previous studies did not consider human visual information, such as the gaze position and the visual field of view, when estimating eyestrain. For instance, Lee and Park measured eyestrain on the basis of the change in pupil size in relation to changes in four adjustment factors: brightness, contrast, saturation, and hue.5 However, each factor was calculated from the whole image on the display without considering the influence of the human gaze position. Other factors, such as edge difference and scene change, were likewise calculated from the whole image.4 In other words, these studies assumed that every region of a given image on the display was perceived equally by the subject. To overcome this problem, a new eye foveation model is proposed here that considers a user's gaze position and the error of gaze detection. Three video adjustment factors, namely the variance of hue (VH), edge, and motion information, are extracted from the successive images on the displays to which these eye foveation models are applied.
This article is organized as follows. In Sec. 2, the proposed device for gaze tracking and eye response measurement and the methods of analysis are presented. In Sec. 3, the methods for extracting video features, considering the gaze position and the foveation-based visual field of view, are explained. The experimental setup and results are presented in Sec. 4. Finally, Sec. 5 presents the conclusion of this article and the plans for future work.
Proposed Device and Analysis Methods
Device for Measuring Gaze Position and Eye Response
Figure 1 shows the proposed gaze tracking and eye response measurement device.8–11 The eye-capturing camera is attached to an eyeglass frame near the lower part of one eye, as shown in Fig. 1. The camera is a small universal serial bus (USB) web camera that captures the images at a speed of . The spatial resolution of the captured image is . A zoom lens is used to capture magnified images of the eye. To screen out visible light, a near-infrared (NIR) passing filter is attached to the camera lens.8–11
Figure 2 shows an example of the experimental setup. Four NIR illuminators of 850 nm each are attached to an LCD display.8–11 They do not affect the user's vision because NIR light at 850 nm does not dazzle the user's eye. The four NIR illuminators produce four corneal specular reflections, as shown in Fig. 3; because the illuminators are attached to the four corners of the display, these reflections represent its rectangular area.8,9
Gaze Tracking Method
As a user-dependent calibration, each user first gazes at a central position on the display; this step is required to compensate for the angle kappa, the angular offset between the visual axis and the pupillary axis.9,11 Using the captured eye image, the pupil center is detected on the basis of circular edge detection, local binarization, component labeling, size filtering, filling of the specular reflection area, and calculation of the geometric center of the remaining black pixels as the pupil center.9–11 Figure 3 shows the four specular reflections of the four NIR illuminators attached to the corners of the LCD screen. These reflections are located by binarization, component labeling, and size filtering.9 The four specular reflections represent the rectangular area of the display. Therefore, on the basis of the detected pupil center and the four specular reflections, the user's gaze position on the display is calculated according to the geometric transform between the rectangle formed by the four reflections and the rectangle of the display.9,11
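As a sketch of this mapping step, the code below (a minimal illustration, not the authors' implementation; the glint coordinates, display size, and pupil position are hypothetical) solves the projective transform defined by the four corneal reflections and the display corners, then applies it to the detected pupil center:

```python
import numpy as np

def compute_homography(src, dst):
    """Solve the 3x3 projective transform H mapping src[i] -> dst[i]
    for four point correspondences (h33 fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def gaze_position(pupil_center, glints, display_size):
    """Map the pupil center through the transform defined by the four
    corneal reflections (glints) and the display corners."""
    w, h = display_size
    corners = [(0, 0), (w, 0), (w, h), (0, h)]
    H = compute_homography(glints, corners)
    p = H @ np.array([pupil_center[0], pupil_center[1], 1.0])
    return p[0] / p[2], p[1] / p[2]

# Hypothetical glint positions (pixels in the eye image), ordered to
# match the display corners: top-left, top-right, bottom-right, bottom-left.
glints = [(100, 80), (220, 80), (220, 170), (100, 170)]
print(gaze_position((160, 125), glints, (1280, 1024)))
```

Here a pupil center midway between the glints maps to the center of the display, consistent with the geometric-transform description above.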
Eye Response Measurement
In this research, the average eye BR is used to measure eyestrain. In previous research,12,13 an increase in BR was observed as a function of time on task. On the basis of these findings, previous studies measured eyestrain under the assumption that more frequent blinking corresponds to greater eyestrain.2,4 The average BR is calculated in a time window of 60 s, and the time window is moved with an overlap of 50 s (i.e., in steps of 10 s).
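The windowing scheme can be sketched as follows (an illustrative implementation; the blink timestamps are hypothetical):

```python
import numpy as np

def average_blink_rate(blink_times, total_s, window_s=60, overlap_s=50):
    """Average blink rate (blinks/min) in 60-s windows shifted by
    window_s - overlap_s = 10 s, as described above."""
    step = window_s - overlap_s
    rates = []
    start = 0.0
    while start + window_s <= total_s:
        n = sum(start <= t < start + window_s for t in blink_times)
        rates.append(n * 60.0 / window_s)
        start += step
    return rates

# Hypothetical blink timestamps (seconds) over a 90-s recording.
blinks = [1.2, 7.5, 14.9, 33.0, 41.8, 58.3, 65.1, 72.4, 88.0]
print(average_blink_rate(blinks, total_s=90))  # -> [6.0, 5.0, 5.0, 6.0]
```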
Extraction of Video Features by Considering Gaze Position and Visual Field of View
Contrast Sensitivity Model Based on Foveation
To measure visual sensitivity according to the gaze position and angular offset, it is necessary to determine visual sensitivity as a function of retinal eccentricity. For this, previous research on visual sensitivity is referenced, which showed that visual sensitivity decreases as the distance from the gaze position increases. The algorithm for calculating this sensitivity, which has been employed to improve image and video coding efficiency, is called foveation.14–17 In this research, eyestrain is measured by calculating a user's gaze position and by determining the user's visual information on the basis of foveation. Humans perceive a dramatic decrease in visual sensitivity in areas away from the point of gaze. In detail, the point of gaze is perceived at high resolution, but the perceived resolution decreases as the distance from this point increases. Accordingly, a foveation (visual field of view) model based on the gaze information is defined. The foveation is determined using the contrast threshold (CT) formula, CT(f, e) = CT0 exp[α f (e + e2)/e2], which is based on human contrast sensitivity (CS) data measured as a function of spatial frequency f and retinal eccentricity e.14–16 The optimal fitting parameters are determined on the basis of previous research (α is 0.106, e2 is 2.3, and CT0 is 1/64).14,16 The CS is defined as the reciprocal of the CT.14,16
To apply these models to an image, the eccentricity needs to be calculated for any point (pixel) x in the image. Because the user's gaze position is the foveation point x_f, the distance from x to x_f is d(x) = ||x − x_f||, and the eccentricity is given by e(v, x) = tan⁻¹[d(x)/(Nv)], where N is the image width (in pixels) and v is the viewing distance measured in image widths.14,16 The cutoff frequency f_c, above which high-frequency components cannot be perceived, is obtained by setting CT to 1 (the maximum possible contrast) in Eq. (1): f_c(e) = e2 ln(1/CT0)/[α(e + e2)].14,16
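A minimal sketch of this contrast-sensitivity model, following the Geisler–Perry formulation used in Refs. 14 and 16 (the parameter values, including CT0 = 1/64, are taken from that literature and are assumptions here):

```python
import numpy as np

# Model parameters from the cited foveation literature (Refs. 14, 16).
ALPHA, E2, CT0 = 0.106, 2.3, 1.0 / 64.0

def contrast_threshold(f, e):
    """CT(f, e) = CT0 * exp(alpha * f * (e + e2) / e2)."""
    return CT0 * np.exp(ALPHA * f * (e + E2) / E2)

def contrast_sensitivity(f, e):
    """CS is the reciprocal of CT."""
    return 1.0 / contrast_threshold(f, e)

def cutoff_frequency(e):
    """Highest perceivable frequency: solve CT(f_c, e) = 1 for f_c."""
    return E2 * np.log(1.0 / CT0) / (ALPHA * (e + E2))

def eccentricity(x, xf, N, v):
    """Eccentricity (deg) of pixel x given foveation point xf, image
    width N (pixels), and viewing distance v (image widths)."""
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(xf, float))
    return np.degrees(np.arctan2(d, N * v))

print(cutoff_frequency(0.0))  # cutoff at the fovea (cycles/degree)
```

As expected, the cutoff frequency is highest at the fovea (e = 0) and falls off with eccentricity.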
In Fig. 4, a brighter region represents higher contrast sensitivity.
New Foveated Weighting Model in the Wavelet Domain by Considering Gaze Detection Error
The image is decomposed into subbands by the discrete wavelet transform.14,16 The LL subregion has low-frequency components in both the horizontal and vertical directions. The HH subregion includes high-frequency components in both the horizontal and vertical directions. The HL subregion comprises high-frequency components in the horizontal direction and low-frequency components in the vertical direction. Finally, the LH subregion contains low-frequency components in the horizontal direction and high-frequency components in the vertical direction.18 S_w(λ, θ) is the error sensitivity in subband (λ, θ); the method for calculating S_w(λ, θ) is shown in Refs. 14 and 16. For a given wavelet coefficient at position (x, y) ∈ B_{λ,θ} [where B_{λ,θ} is the set of wavelet coefficient positions existing in subband (λ, θ)], the distance from the foveation point in the spatial domain is shown in Refs. 14 and 16:
The explanations given in Eqs. (1)–(10) represent the conventional foveation model of Refs. 14 and 16, but they do not consider the errors of gaze detection when calculating the foveation model. In general, an error inevitably exists between the ground-truth gaze position and the calculated gaze position.9–11 However, the foveation-based visual sensitivity model of Eqs. (9) and (10) and Fig. 4 does not consider this error.
Therefore, we propose an eye foveation model that considers both the gaze position and the error in detecting it, as follows. Since N is the width of the image and v is the viewing distance (measured in image widths) from the eye to the image plane,14,16 Nv is the calculated distance from the user's eye to the image plane in pixels. Assuming that θ_e is the accuracy of the gaze tracking (in degrees), the consequent gaze detection error is calculated as Nv tan(θ_e). Within the range of the gaze detection error [d(x) ≤ Nv tan(θ_e)], all positions x should be treated the same as the foveation (user's gaze) position x_f, since the error boundary is Nv tan(θ_e). Thus, d(x) of Eq. (10) becomes 0. Consequently, Eq. (10) is rewritten as Eq. (11), considering the gaze detection error:
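The modified distance term can be sketched as follows (an illustrative implementation; outside the error radius, the distance is assumed to remain unchanged, and the numeric arguments are hypothetical):

```python
import numpy as np

def foveated_distance(x, xf, N, v, theta_err_deg):
    """Distance from the foveation point with the gaze-detection error
    taken into account: positions closer to the gaze point than the
    error radius N*v*tan(theta_err) are treated as the gaze point
    itself (distance 0), as described for Eq. (11)."""
    r_err = N * v * np.tan(np.radians(theta_err_deg))
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(xf, float))
    return 0.0 if d <= r_err else float(d)

# Hypothetical setup: 512-pixel-wide image, viewing distance of 3 image
# widths, gaze tracking accuracy of 1.12 deg (error radius ~30 pixels).
print(foveated_distance((110, 110), (100, 100), 512, 3, 1.12))  # inside radius
print(foveated_distance((200, 100), (100, 100), 512, 3, 1.12))  # outside radius
```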
Based on Eqs. (9) and (11), the foveation-based contrast sensitivity mask of the single foveation point (gaze point) in the wavelet domain is found as shown in Fig. 5(b). The four-level discrete wavelet transform (DWT) based on Daubechies wavelet bases is used. Brightness indicates the importance of the wavelet coefficients. Higher-contrast sensitivity is shown as a brighter gray level.
Extracting Video Features Considering the Eye Foveation Model
In this research, eyestrain is measured in relation to the changes in the three adjustment features of video: VH, edge, and motion information. To extract features considering gaze position and foveation, foveated images are obtained as follows.
The original color image is first separated into three images of red, green, and blue channels. These three images are decomposed using a DWT based on Daubechies wavelet bases.
The three decomposed images are multiplied by the foveation-based contrast sensitivity mask of Fig. 5(b). From these three foveated images, three images of the red, green, and blue channels in the spatial domain are obtained by the inverse DWT.18 From these three images in the spatial domain, the hue image is obtained using the conversion from RGB to hue, saturation, and intensity (HSI),18 and the VH is obtained as the first feature.
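The hue conversion and VH feature can be sketched as follows (a minimal illustration using the standard RGB-to-HSI hue formula; the 2×2 patch is hypothetical test data, and the DWT/foveation steps are omitted):

```python
import numpy as np

def hue_image(rgb):
    """RGB -> HSI hue (degrees) using the standard conversion formula."""
    r, g, b = [rgb[..., i].astype(float) for i in range(3)]
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12  # avoid /0 on gray
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(b > g, 360.0 - h, h)

def variance_of_hue(rgb):
    """VH feature: variance of the hue values over the (foveated) image."""
    return float(np.var(hue_image(rgb)))

# Hypothetical 2x2 RGB patch: pure red, green, blue, and gray pixels.
patch = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [128, 128, 128]]], dtype=np.uint8)
print(variance_of_hue(patch))
```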
To obtain the motion component (MC) and edge component (EC), the original RGB color image is first converted to a gray image, and the gray image is decomposed using a DWT based on Daubechies wavelet bases. The decomposed gray image is multiplied by the foveation-based contrast sensitivity mask of Fig. 5(b). Figure 6 shows an example of an original gray image and the corresponding foveated image produced by the proposed method. From the foveated image, the gray image in the spatial domain is obtained by the inverse DWT.18 The MC and EC are extracted as the second and third features, respectively, from this gray image. The value of EC is the average edge magnitude calculated by the Canny edge detector in a gray image, and the value of MC is the average pixel difference between successive gray images.
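The EC and MC features can be sketched as follows. Note that a plain gradient magnitude is substituted here for the magnitude stage of the Canny detector, so this is an approximation of the described feature, and the two frames are synthetic:

```python
import numpy as np

def edge_component(gray):
    """EC: average edge magnitude. A simple gradient magnitude is used
    here as a stand-in for the magnitude stage of the Canny detector."""
    gy, gx = np.gradient(gray.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

def motion_component(prev_gray, curr_gray):
    """MC: average absolute pixel difference between successive frames."""
    return float(np.mean(np.abs(curr_gray.astype(float) - prev_gray.astype(float))))

# Hypothetical frames: a vertical step edge that shifts by one pixel.
f0 = np.zeros((4, 4)); f0[:, 2:] = 100.0
f1 = np.zeros((4, 4)); f1[:, 1:] = 100.0
print(edge_component(f0), motion_component(f0, f1))
```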
The VH is averaged in a time window of 60 s, and the time window is moved with an overlap of 50 s, as in the method for measuring BR. The MC and EC are also obtained by the same method. Using the calculated features of the foveated images, the eyestrain based on the average BR (Sec. 2.3) is measured in relation to changes in the three adjustment features of video: VH, MC, and EC.
Figure 7 shows some examples of extracted features in video images captured by a commercial web camera. Figure 7(a) shows an original image. Figures 7(b), 7(c), and 7(d) show the hue image, motion image, and edge image obtained from the original one, respectively. The measured feature values of VH, MC, and EC of Figs. 7(b), 7(c), and 7(d) are 16495.05, 24.28, and 30.84, respectively.
Figure 7(e) shows an original gray image with the foveation point marked by a white crosshair. Figures 7(f), 7(g), and 7(h) show the hue image, motion image, and edge image, respectively, obtained from the image foveated by the conventional foveation model.14,16 The measured feature values of VH, MC, and EC of Figs. 7(f), 7(g), and 7(h) are 16879.22, 14.43, and 9, respectively.
Figures 7(i), 7(j), and 7(k) show the hue image, motion image, and edge image, respectively, obtained from the image foveated by the proposed model. The measured feature values of VH, MC, and EC of Figs. 7(i), 7(j), and 7(k) are 16858.78, 15.31, and 11.15, respectively, which differ from those determined by the previous method,14,16 which does not consider the gaze tracking error.
Experimental Setup and Results
To measure eyestrain in this research, a commercial 19-in. LCD monitor and a commercial movie file were used. The environmental lighting condition was maintained without any external illumination. The temperature and humidity were kept constant, and there was no vibration or unpleasant odor that could affect the experiments. Each subject watched the movie for 25 min 30 s. Eye response data were collected from 24 subjects [average age of 26.54 (standard deviation: 2.24); the minimum and maximum ages were 23 and 31, respectively]. To remove the dependency on watching distance (from the user's eye to the monitor) while covering realistic watching distances, the data of 12 subjects were obtained at a watching distance of 60 cm, and the data of the remaining 12 subjects were collected at a distance of 90 cm.
As mentioned in Sec. 2.3, previous research12,13 observed an increase in BR as a function of time on task. On the basis of these findings, previous studies measured eyestrain under the assumption that more frequent blinking corresponds to greater eyestrain.2,4 Accordingly, the eyestrain based on BR was measured according to the extracted features (VH, MC, and EC). To validate the relationship between these three features and eye responses, a correlation analysis was performed. In this analysis, the correlation coefficient ranges from −1 to +1. A correlation coefficient close to +1 indicates that two variables are positively related; a value close to −1 indicates that they are negatively related. If it is close to 0, there is no relationship between the variables. Table 1 shows the relationship between these three features and eye responses; the results are calculated after removing outliers on the basis of the 95% confidence interval. Because the scales of the VH, MC, EC, and BR are different, the values are normalized using the minimum–maximum scaling method.19
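The normalization and correlation steps can be sketched as follows (illustrative; the VH and BR series are hypothetical). Note that min–max scaling does not change the Pearson correlation coefficient, since it is invariant to linear rescaling of either variable:

```python
import numpy as np

def min_max_scale(x):
    """Normalize a feature series to [0, 1] (minimum-maximum scaling)."""
    x = np.asarray(x, float)
    return (x - x.min()) / (x.max() - x.min())

def pearson_r(x, y):
    """Pearson correlation coefficient between two feature series."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical windowed series: VH values and the corresponding BR values.
vh = [12.0, 15.5, 14.0, 18.2, 20.1]
br = [10.0, 12.0, 11.5, 14.0, 15.5]
print(pearson_r(min_max_scale(vh), min_max_scale(br)))
```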
Table 1. Relationship between the three adjustment features and eye responses (average values of experimental data from 24 subjects).
| Eye responses | Adjustment features | Average correlation coefficient | Average gradient | Average R² |
| Blinking rate (BR) | Variance of hue (VH) | 0.4115 | 0.2644 | 0.2310 |
| | Motion component (MC) | −0.4059 | −0.3273 | 0.2095 |
| | Edge component (EC) | −0.5078 | −0.3387 | 0.3455 |
As listed in Table 1, the average correlation coefficients between the adjustment factors (VH, MC, and EC) and BR were calculated as 0.4115, −0.4059, and −0.5078, respectively. On the basis of the average correlation coefficients in Table 1, we found that the adjustment of VH is positively related to eyestrain, whereas the adjustments of MC and EC are negatively related to eyestrain. Therefore, an increase in VH increases eyestrain, whereas increases in MC and EC reduce eyestrain.
The average gradient is the slope of the line fitted by linear regression, and it represents the rate of change of VH, MC, or EC according to that of BR. Linear regressions were also performed to analyze the change in eye response in relation to the change in the adjustment factors in Table 1. On the basis of the results (average gradient) of the linear regression, it is observed that if the MC or EC increases, the eyestrain decreases; in contrast, if the VH increases, the eyestrain also increases. The R² values between the three adjustment factors and BR were calculated as 0.2310, 0.2095, and 0.3455, respectively. In Tables 1 and 2 and Fig. 8, R² refers to the degree of fit obtained when using the regression method.20 In general, greater values of R² represent a better fit. Figure 8 shows examples of 2-D dot graphs for one subject, where each dot denotes the average BR and its corresponding adjustment factor (VH, MC, or EC).
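The gradient and R² of a fitted line can be computed as follows (illustrative; the normalized (BR, EC) pairs are hypothetical and merely show a negative relationship like the one reported in Table 1):

```python
import numpy as np

def fit_line(x, y):
    """Least-squares line y = a*x + b; returns (gradient a, intercept b, R^2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a, b = np.polyfit(x, y, 1)
    y_hat = a * x + b
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
    return float(a), float(b), float(1.0 - ss_res / ss_tot)

# Hypothetical normalized (BR, EC) pairs: EC tends to fall as BR rises.
br = [0.1, 0.3, 0.5, 0.7, 0.9]
ec = [0.85, 0.70, 0.55, 0.35, 0.20]
gradient, intercept, r2 = fit_line(br, ec)
print(gradient, r2)
```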
Table 2. Experimental values from the 24 subjects.
| Subject number | Correlation coefficient | Gradient | R² |
Because the y-intercept points of the fitted lines (the points where the fitted lines intersect the y-axis) and the distributions of the data differ for each of the 24 subjects, it is difficult to obtain a meaningful result from the average over all subjects. Instead, we included both the average results and the individual results of the 24 subjects in Tables 1 and 2, respectively.
Figure 9 shows examples of gaze detection results. The circles represent the reference points at which each subject should look, and the crosshairs show the gaze points calculated by our gaze detection algorithm (explained in Sec. 2.2). A total of five subjects looked at the nine reference points five times each, and each crosshair shows the average point of the five trials for each subject. We measured the gaze detection error as the angle between the vector to the reference point and the vector to the calculated gaze position. The gaze detection error between the reference and gaze points was about 1.12 deg. As seen in Fig. 9, the reference points differ from the calculated gaze points. In other words, the gaze error for each subject can occur randomly inside a circle whose radius corresponds to 1.12 deg, and we consider this circle when generating the eye foveation model. Therefore, the eye foveation model without this gaze detection error, shown in Fig. 5(a), differs from the proposed eye foveation model, which considers the gaze detection error, shown in Fig. 5(b).
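The angular error metric can be sketched as follows (illustrative; the 3-D coordinates are hypothetical, with the eye placed 60 cm from the screen as in one of the experimental conditions):

```python
import numpy as np

def angular_error_deg(eye_pos, ref_point, gaze_point):
    """Angle (deg) between the eye-to-reference and eye-to-gaze vectors."""
    v1 = np.asarray(ref_point, float) - np.asarray(eye_pos, float)
    v2 = np.asarray(gaze_point, float) - np.asarray(eye_pos, float)
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

# Hypothetical 3-D points (cm): the eye 60 cm in front of the screen,
# a reference point at the screen center, and a gaze estimate 1 cm off.
eye = (0.0, 0.0, 60.0)
ref = (0.0, 0.0, 0.0)
gaze = (1.0, 0.0, 0.0)
print(angular_error_deg(eye, ref, gaze))
```

A 1-cm offset at a 60-cm viewing distance corresponds to roughly 0.95 deg, which is comparable in scale to the reported 1.12-deg average error.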
Conclusion
This research introduced a new eyestrain measurement method that considers an eye foveation model. On the basis of this measurement, it was confirmed that a stable relationship exists between eyestrain and the three adjustment factors (color, edge, and motion information). Experimental results showed that a greater degree of VH induced higher eyestrain. In contrast, greater degrees of EC and MC induced relatively lower eyestrain. With recent developments in television technology, the smart TV, which includes a built-in camera, has become widespread. On the basis of the results of this research, an intelligent display can be envisioned that reduces the user's eyestrain by decreasing the VH or increasing the edge and motion information of a video on the basis of the eye response measured by the built-in camera.
In future work, the relationship between eyestrain and video factors in various kinds of displays, such as 3-D stereoscopic or holographic displays, will be investigated on the basis of gaze detection and the proposed foveation model.
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant No. 2012R1A1A2038666).
Won Oh Lee received a BS degree in electronics engineering from Dongguk University, Seoul, South Korea, in 2009. He is currently pursuing the combined course of Master and PhD degree in electronics and electrical engineering at Dongguk University. His research interests include biometrics and pattern recognition.
Hwan Heo received the BS degree in computer engineering from National Institute for Lifelong Education, Seoul, South Korea, in 2009. He is currently pursuing the combined course of Master and PhD degree in electronics and electrical engineering at Dongguk University. His research interests include image processing, computer vision, and HCI.
Eui Chul Lee received his BS degree in software in 2005, and his Master and PhD degrees in computer science in 2007 and 2010, respectively, from Sangmyung University, Seoul, South Korea. He is currently an assistant professor in the Department of Computer Science at Sangmyung University. His research interests include computer vision, biometrics, image processing, and HCI.
Kang Ryoung Park received his BS and Master degrees in electronic engineering from Yonsei University, Seoul, Korea, in 1994 and 1996, respectively. He also received his PhD degree in computer vision from the Department of Electrical and Computer Engineering, Yonsei University, in 2000. He was an assistant professor in the Division of Digital Media Technology at Sangmyung University until February 2008. He is currently a professor in the Division of Electronics and Electrical Engineering at Dongguk University. His research interests include computer vision, image processing, and biometrics.